Commit e42906d

Add post announcing the EOSS 6 award
2 parents 5b40d9b + 90da942 commit e42906d

1 file changed

+161
-0
lines changed

content/blog/eoss6_award.md

+161
@@ -0,0 +1,161 @@
1+
+++
2+
date = "2024-11-11T08:00:00+00:00"
3+
author = "Athan Reines"
4+
title = "CZI EOSS 6 Award to Advance Array Interoperability within the PyData Ecosystem"
5+
tags = ["APIs", "standard", "consortium", "arrays", "community", "funding", "czi", "eoss6"]
6+
categories = ["Consortium", "Standardization"]
7+
description = "The Chan Zuckerberg Initiative (CZI) awarded an EOSS Cycle 6 grant to the Data APIs Consortium to advance array interoperability within the PyData ecosystem."
8+
draft = false
9+
weight = 30
10+
+++
11+
12+
We are thrilled to announce that the Chan Zuckerberg Initiative (CZI) recently
13+
awarded the Consortium for Python Data API Standards an Essential Open Source
14+
Software for Science (EOSS) Cycle 6 grant to support ongoing work within the
15+
Consortium and to accelerate the adoption of the Array API Standard across the
16+
PyData ecosystem. With this award, we'll drive forward our vision of
17+
standardizing a universal API for array operations, enhancing library
18+
interoperability, and increasing accessibility to high-performance
19+
computational resources across scientific domains.
20+
21+
## The Importance of the EOSS Program
22+
23+
The EOSS program by CZI was launched to support open source software that is
24+
foundational for scientific research, especially within biology and medicine.
25+
As software tools underpin modern scientific investigation, ensuring these
26+
tools receive adequate funding is crucial for sustainable growth and long-term
27+
impact. Through EOSS, CZI has committed to funding development, usability
28+
improvements, community engagement, and maintenance efforts for critical open
29+
source tools. This support enables open source software to be more accessible,
30+
reliable, and adaptable to researchers' evolving needs.
31+
32+
With the EOSS Cycle 6 award, Quansight, in cooperation with collaborators within
33+
the Consortium and the broader ecosystem, will focus on advancing
34+
interoperability, improving ease of Array API adoption, and reducing array
35+
library fragmentation within the PyData ecosystem.
36+
37+
## Addressing Fragmentation in the PyData Ecosystem
38+
39+
As Python's popularity has grown, so has the number of frameworks and libraries
40+
for numerical computing, data science, and machine learning. Researchers and
41+
data science practitioners now have access to a vast suite of tools and
42+
libraries for computation, but this diversity comes with the challenge of
43+
fragmented APIs for fundamental data structures such as multidimensional
44+
arrays. While array libraries largely follow similar paradigms, their API
45+
differences present a real challenge for users who need to switch between or
46+
integrate multiple libraries in their workflows.
47+
48+
The Consortium for Python Data API Standards, founded in 2020, addresses this
49+
issue directly. By standardizing a universal array API, the Consortium seeks to
50+
simplify the process for users moving between libraries and foster an ecosystem
51+
where array operations are seamless across libraries such as NumPy, CuPy,
52+
PyTorch, and JAX. To date, the Array API Standard has seen adoption by major
53+
libraries, laying the groundwork for an interoperable PyData ecosystem that
54+
emphasizes compatibility and ease of use.
55+
56+
If you're curious to learn more about the Consortium, its origins, and the
57+
benefits of standardization, be sure to read our 2023 SciPy Proceedings paper
58+
["Python Array API Standard: Toward Array Interoperability in the Scientific
59+
Python Ecosystem"](https://proceedings.scipy.org/articles/gerudo-f2bc6f59-001).
60+
61+
## Scope of Work for the EOSS 6 Award
62+
63+
The EOSS 6 award will help the Consortium focus on key initiatives to expand
64+
adoption and improve compatibility across the ecosystem. The proposed work
65+
includes:
66+
67+
### Array API Adoption in Downstream Libraries
68+
69+
One of our primary goals is to further adoption of the Array API Standard in
70+
downstream libraries, such as SciPy, scikit-learn, and scikit-image.
71+
Historically, many downstream libraries have depended on NumPy, which
72+
limits their execution model to CPU-bound computation and, in turn, their ability
73+
to leverage the performance advantages of GPU- or TPU-based computation. By
74+
adopting the Standard, downstream libraries will be able to support array
75+
libraries such as CuPy and PyTorch, empowering researchers to take advantage of
76+
the hardware acceleration options suitable to their needs.
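
As a minimal sketch of what adoption looks like in downstream code: the Standard specifies that an array exposes `__array_namespace__()`, which returns the namespace of functions for that array's library. The `ListArray` and `ListNamespace` classes below are hypothetical toy stand-ins (not part of any real library) so that the example runs with only the Python standard library; `standardize` is an illustrative function, not code from SciPy, scikit-learn, or scikit-image.

```python
import math


class ListNamespace:
    """Toy namespace with two functions named after Standard functions."""

    @staticmethod
    def mean(x):
        return sum(x.values) / len(x.values)

    @staticmethod
    def std(x):
        # Population standard deviation (the Standard's default correction is 0).
        m = ListNamespace.mean(x)
        return math.sqrt(sum((v - m) ** 2 for v in x.values) / len(x.values))


class ListArray:
    """Toy 1-D array backed by a Python list (hypothetical, illustration only)."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def __array_namespace__(self, api_version=None):
        # Entry point defined by the Standard: return this array's namespace.
        return ListNamespace

    def __sub__(self, scalar):
        return ListArray(v - scalar for v in self.values)

    def __truediv__(self, scalar):
        return ListArray(v / scalar for v in self.values)


def standardize(x):
    """Rescale x to zero mean and unit variance using the library that owns x."""
    xp = x.__array_namespace__()  # no hard-coded `import numpy`
    return (x - xp.mean(x)) / xp.std(x)


z = standardize(ListArray([1.0, 2.0, 3.0, 4.0]))
print(z.values)
```

Because `standardize` never names a specific library, the same function body should in principle accept arrays from any library whose arrays implement `__array_namespace__`; for libraries that do not yet expose the method directly, a compatibility layer is typically needed.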
77+
78+
### Infrastructure for Adoption and Compliance Tracking
79+
80+
We're also committed to building infrastructure to monitor compliance and
81+
adoption of the Array API across the ecosystem. While we have already developed
82+
a [test suite](https://github.com/data-apis/array-api-tests) to measure
83+
compliance for array libraries, this tool has been largely developer-facing,
84+
leaving end users with limited visibility into compatibility across different
85+
libraries. To address this gap, we will create public mechanisms, such as
86+
compatibility tables, for tracking which libraries are adopting the Standard
87+
and helping users make informed decisions about which libraries to use.
88+
89+
Additionally, we plan to develop mechanisms for automating compliance tracking
90+
within array library continuous integration (CI) workflows, allowing real-time
91+
monitoring of adoption and compatibility regressions. This infrastructure will
92+
hopefully instill greater confidence among end users in array library
93+
compatibility and help array library developers maintain interoperability.
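
One way such a CI check could look, as a rough sketch only: compare per-test outcomes from two runs of a compliance suite and flag tests that used to pass but no longer do. The function names, the result format (a mapping from test ID to pass/fail), and the test IDs are all assumptions made for illustration; they do not describe the planned infrastructure or the actual output format of the test suite.

```python
def find_regressions(previous, current):
    """Return test IDs that passed in `previous` but do not pass in `current`."""
    return sorted(
        test_id
        for test_id, passed in previous.items()
        if passed and not current.get(test_id, False)
    )


def pass_rate(results):
    """Fraction of tests that passed (0.0 for an empty run)."""
    return sum(results.values()) / len(results) if results else 0.0


# Hypothetical results keyed by made-up test IDs.
previous = {"creation::zeros": True, "linalg::matmul": True, "sorting::argsort": False}
current = {"creation::zeros": True, "linalg::matmul": False, "sorting::argsort": True}

regressions = find_regressions(previous, current)
print(f"pass rate: {pass_rate(previous):.0%} -> {pass_rate(current):.0%}")
print("regressions:", regressions)
```

Note that the overall pass rate is unchanged between the two hypothetical runs even though one test regressed, which is why a per-test comparison is more informative than a single aggregate number.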
94+
95+
### Comprehensive Documentation and Migration Guides
96+
97+
As adoption grows, we recognize the need for high-quality documentation and
98+
migration guides to help users and developers transition seamlessly to using
99+
the Array API Standard. Through our collaborations with library maintainers,
100+
we've gathered insights into best practices for building array library-agnostic
101+
applications. With EOSS 6 funding, we'll transform these insights into
102+
tutorials, case studies, and migration guides to facilitate adoption among
103+
downstream libraries. By offering clear and accessible resources, we aim to
104+
reduce the learning curve for new users and provide developers with the tools
105+
they need to confidently build array library-agnostic applications.
106+
107+
## Value to the Scientific Community and End Users
108+
109+
The work funded by this award will provide significant benefits to users within
110+
the scientific research community. Our hope is that this work will yield three
111+
primary outcomes:
112+
113+
1. **Interoperability Across Libraries**: Fragmentation within the ecosystem has
114+
often led to duplication of effort, limited access to hardware acceleration,
115+
and repeated re-implementation of foundational array structures.
116+
By fostering interoperability across libraries, we aim to simplify the process
117+
of moving between technical stacks and unlock new performance gains for array
118+
library consumers.
119+
120+
2. **Standardization and Reduced Switching Costs**: Users will benefit from
121+
shorter learning curves and lower costs associated with switching libraries.
122+
With standardized APIs and robust compliance infrastructure, users will have
123+
greater confidence that their workflows will be portable across array
124+
libraries, regardless of the underlying computational backend.
125+
126+
3. **Enhanced Performance for Array-Consuming Libraries**: Array API adoption
127+
has [already shown](https://proceedings.scipy.org/articles/gerudo-f2bc6f59-001)
128+
promising performance improvements across several libraries in the ecosystem.
129+
For example, performance gains of up to 50x in SciPy and 10-40x in scikit-learn
130+
were observed upon integrating support for alternative array libraries such as
131+
CuPy and PyTorch. We hope to observe similar acceleration in other downstream
132+
libraries, which could dramatically reduce analysis time for computationally
133+
intensive research tasks, ultimately improving efficiency and access for users
134+
working with high-dimensional data.
135+
136+
## Looking Forward
137+
138+
As we embark on this phase of our work, we're excited to continue pushing
139+
forward the Array API Standard as a unifying foundation for the PyData
140+
ecosystem. Support from CZI's EOSS program is instrumental in making this
141+
vision a reality, and we're committed to expanding the impact of the Array API
142+
Standard through real-world applications and community engagement.
143+
144+
With this award, we're not only addressing technical fragmentation but also
145+
advancing a more inclusive, accessible, and robust future for scientific
146+
computing. We look forward to collaborating with the community to make array
147+
interoperability a reality across the ecosystem and to empower researchers with
148+
tools that help them achieve scientific breakthroughs more efficiently and
149+
effectively.
150+
151+
Stay tuned for updates as we implement these initiatives and continue to
152+
strengthen the foundations of the PyData ecosystem!
153+
154+
---
155+
156+
## Funding Acknowledgment
157+
158+
This project has been made possible in part by grant number EOSS6-0000000621
159+
from the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley
160+
Community Foundation. Athan Reines is the grant's principal investigator and
161+
Quansight Labs is the entity receiving and executing on the grant.
