Sebastian Galkin
@functionth.bsky.social
So happy with this milestone. Lots of work went into this one!
3 months ago
0
1
0
reposted by
Sebastian Galkin
Earthmover
5 months ago
Our latest fundamentals blog post provides an overview of
@zarr.dev
and its open-source ecosystem. Read more:
earthmover.io/blog/what-is...
Fundamentals: What Is Zarr? A Cloud-Native Format for Tensor Data - Earthmover
What Zarr is, and how it enables fast, scalable access to multidimensional array data in the cloud.
https://earthmover.io/blog/what-is-zarr
0
10
3
reposted by
Sebastian Galkin
Earthmover
5 months ago
How does Icechunk avoid redundant storage between data versions? Icechunk stores only new or changed chunks for each version - no redundant copies or rewrites. You get instant time travel, branching, and efficient updates, all with negligible storage overhead. More:
bit.ly/3F1XFST
Icechunk: Efficient storage of versioned array data - Earthmover
We recently got an interesting question in Icechunk's community Slack channel (thank you Iury Simoes-Sousa for motivating this post): I'm new to Icechunk. How is the storage managed for redundant info...
https://earthmover.io/blog/icechunk-efficient-storage-of-versioned-array-data
0
3
4
reposted by
Sebastian Galkin
Earthmover
5 months ago
Our latest blog post dives into the chaos of the status quo - where every tweak means regenerating the whole dataset, and collaboration and experimentation are often stifled by silos and secret knowledge. Check out the full post:
earthmover.io/blog/tensoro...
TensorOps: Scientific Data Doesn't Have to Hurt - Earthmover
Curious how your team scores on the "Data Pain Survey"? Wondering why your teams are building Rube Goldberg machines just to put some data on a map? Or just want to see our plan to bring order to your...
https://earthmover.io/blog/tensorops-scientific-data-doesnt-have-to-hurt
0
3
4
After months of Rust, I wrote some Python this weekend. I immediately got burned by global mutable state.
5 months ago
0
7
0
Last week
@deepakcherian.bsky.social
gave a fascinating talk at NCAR on data sharing and open data: the historical perspective, the achievements and failures past and present, and how to learn and move forward to fulfill the promises. Remarkable and illuminating.
www.youtube.com/watch?v=JZT3...
CISL Seminar: Deepak Cherian (Earthmover)
YouTube video by NCAR Computational and Information Systems Laboratory (CISL)
https://www.youtube.com/watch?v=JZT3rS7vOtg
5 months ago
0
1
0
Had the idea of using Icechunk (a multi-dimensional array database) for something I would never use Icechunk for
5 months ago
0
0
0
reposted by
Sebastian Galkin
Earthmover
6 months ago
1/ 💡 Our latest blog post in the fundamentals series, written by
@tegnicholas.bsky.social
, demystifies cloud-optimized scientific data formats! Read more:
earthmover.io/blog/fundame...
Fundamentals: What is Cloud-Optimized Scientific Data?
What cloud-optimized data really means, and how Zarr and Icechunk enable fast access to massive scientific datasets in cloud object storage.
https://earthmover.io/blog/fundamentals-what-is-cloud-optimized-scientific-data
2
16
12
reposted by
Sebastian Galkin
TEGNicholas.bsky.social
6 months ago
You could also do this for arbitrarily large scientific array datasets using Xarray + Icechunk + R2/Tigris
juhache.substack.com/p/0-data-dis...
0$ Data Distribution
Ju Data Engineering Weekly - Ep 78
https://juhache.substack.com/p/0-data-distribution
0
0
1
reposted by
Sebastian Galkin
Earthmover
6 months ago
📣 Blog post alert! Exploring Icechunk scalability: untangling S3's prefix story. This technical post by
@functionth.bsky.social
dives deep into the internals of how S3 shards data, showing that distributed Icechunk can easily perform 230,000 object reads/sec and beyond.
earthmover.io/blog/explori...
Exploring Icechunk scalability: untangling S3's prefix story | Earthmover
We show Icechunk can scale to extremely high concurrency levels, and explain how it achieves this in modern object stores.
https://earthmover.io/blog/exploring-icechunk-scalability
2
5
7
reposted by
Sebastian Galkin
Joe Hamman
6 months ago
We often see folks try to convince tabular data tools to perform well with multi-dimensional array data. This post by
@rabernat.bsky.social
explains, from first principles, why this rarely works. It's a good one!
1
3
1
I've worked on Icechunk almost exclusively for the last six months. I'm very proud of the result; you should check it out.
6 months ago
0
3
0
reposted by
Sebastian Galkin
Earthmover
8 months ago
1/ Check out our latest blog post
earthmover.io/blog/xarray-...
to learn about the dramatic improvement in the performance of Xarray's Zarr backend. We improved the "time to first byte" metric, building on Zarr-Python's new asyncio internals.
Accelerating Xarray with Zarr-Python 3 | Earthmover
We have recently dramatically improved the performance of Xarray's Zarr backend. This post explores how we've improved the "time to first byte" metric, building on Zarr-Python's new asyncio internals.
https://earthmover.io/blog/xarray-open-zarr-improvements
1
4
7