Publish smaller packages #10636

ggetz opened this issue Aug 4, 2022 · 22 comments

@ggetz
Contributor

ggetz commented Aug 4, 2022

CesiumJS is an extensive library with a lot of functionality that, depending on the app or use case, goes unused. Installing the cesium npm package also syncs third party libraries which may go unused. All features are tied to the monthly Cesium versioning, whether or not anything has changed.

Many users should continue to use the "traditional" CesiumJS releases, but there are also many use cases that would benefit from ingesting smaller, more specific packages.

See this community forum thread for some seeds of this discussion.

  • Maintain the traditional combined release as before. There may be some internal refactoring, but from an end user standpoint, the cesium npm package and release would remain as-is
  • Maintain all code from within the same repo. We don't want to introduce additional development burden with additional repos or submodules
    • workspaces are built into npm for use cases such as this. It allows dependency management to work similar to before with all modules installed to one top-level node_modules directory
    • All packages would continue to share the same tooling
  • Publish individual packages along architectural boundaries. In theory, nothing should stop us from breaking up and publishing the code as-is with some additional configuration. However, due to some interconnections in the architecture, this may require some refactoring so that we are not "cheating" by using private APIs.
    • Each new package would be able to be published individually
    • Each new package would only require relevant dependencies. This means dependencies would only be synced if they are strictly needed, rather than the approach we take now where all dependencies are installed regardless
    • Each new package would follow semver, while traditional CesiumJS release could remain with its monthly versioning strategy
  • Most major JS projects implement a variation of this (React, Babel, BabylonJS, Angular…)
  • While not needed for an initial implementation, eventually, this could allow us to take advantage of tooling for dependency graphs and more, allowing us to do things like run only the unit tests relevant to a change
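
As a rough illustration of the last point, here is a minimal sketch of how a dependency graph could be used to decide which packages need their tests re-run after a change. The package names and the `affectedPackages` helper are hypothetical and only mirror the kind of split discussed in this issue; real workspace tooling would derive the graph from each package.json.

```javascript
// Hypothetical workspace graph: package -> packages it depends on.
// The names are placeholders; exact naming is not decided here.
const dependsOn = {
  "cesium": ["cesium/engine", "cesium/widgets"],
  "cesium/widgets": ["cesium/engine"],
  "cesium/engine": [],
};

// Returns the changed package plus everything that depends on it,
// directly or transitively. Only these packages need their tests re-run.
function affectedPackages(graph, changed) {
  const affected = new Set([changed]);
  let grew = true;
  while (grew) {
    grew = false;
    for (const [name, deps] of Object.entries(graph)) {
      if (!affected.has(name) && deps.some((dep) => affected.has(dep))) {
        affected.add(name);
        grew = true;
      }
    }
  }
  return [...affected].sort();
}

console.log(affectedPackages(dependsOn, "cesium/widgets")); // [ 'cesium', 'cesium/widgets' ]
console.log(affectedPackages(dependsOn, "cesium/engine")); // all three packages
```

In this sketch, a change to the widgets package never triggers the engine tests, which is the kind of saving the bullet above alludes to.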

A few use cases:

  • Widgets
    • CesiumViewer and other Widgets work well for simple use cases that want to get something really basic up quickly. But a more sophisticated app does not need them, and they can actively be a hindrance as they bring in many static assets.
    • Separating widgets would isolate the dependency on knockout, which is generally incompatible with other UI/Component stacks, and itself can be a security issue. Notably, the basic CesiumWidget does not rely on knockout.
  • Core/Math
  • KML, GeoJson/TopoJson, and Google Earth support each rely on specific dependencies that are not used elsewhere in the API
@sanjeetsuhag
Contributor

Another benefit this may have is that it would allow us to upgrade our testing tools. We could take advantage of something like Jest, which could run tests in parallel. For stuff that needs to be run in a browser, we could use Playwright or Puppeteer and run tests in all browsers - potentially allowing us to do screenshot testing for things like Sandcastle.

@ggetz
Contributor Author

ggetz commented Aug 29, 2022

For the first iteration of this, we'll focus on creating a separate non-widget, engine-only package and a widgets package dependent on the former. This will allow us to set up the repo to support publishing multiple modules while focusing on one set. Separating out Widgets isolates the knockout and nosleep dependencies, and it minimizes and simplifies build configuration for those not interested in using the knockout-flavored CesiumJS widgets.

Packages (exact naming TBD)

graph TD;
    cesium/widgets-->cesium/engine
    cesium-->cesium/engine
    cesium-->cesium/widgets
    cesium/engine-->deps[other npm dependencies]
    cesium/widgets-->knockout
    cesium/widgets-->nosleep

New packages should be added as an npm workspace and have their own package.json file. Many projects use a top-level packages folder, but that is not a hard requirement and we should consider what naming makes sense for CesiumJS.

packages
| engine
| | src
| | | Core
| | | DataSources
| | | ...
| | package.json
| | README.md
| | LICENSE.md
| | ...
| widgets
| | src
| | | ...
| | package.json
| | README.md
| | LICENSE.md
| | ...

Some things we'll need to consider for each package:

  • Naming - Make sure to run this by the community (and that it doesn't conflict with any other existing packages)
  • Entry points - Shipping just the ESM will require a bundler or other build tooling, but will allow for tree shaking and other optimizations, while the upstream "traditional" build can maintain the built Cesium.js.
  • Each package should include typescript definitions
  • Documentation - I think we want to stick with our original tooling for generating the API documentation, and continue to generate globally. (This is what Babylon does for instance)
  • Testing, linting - These can be done either globally for the entire project or centrally for each package, but it should continue to use the same tooling for now. As @sanjeetsuhag mentioned, as we get more sophisticated with our use, we could reconsider tooling and get a bit more targeted. But for now, let's keep the scope small.
  • LICENSE.md matching the limited dependencies for each package.
  • README.md should include a minimal example and document best practices (for example in Babylon.js).

@ggetz ggetz moved this from Next priority to In Progress in CesiumJS Issue/PR backlog Aug 30, 2022
@ptrgags
Contributor

ptrgags commented Sep 6, 2022

@ggetz This is exciting to see! Is my understanding correct on these details?

  1. Code from Scene or Renderer would end up in cesium/engine, at least for this first iteration
  2. Future iterations might further divide cesium/engine into smaller packages, perhaps separating graphics/geospatial math from rendering details

@ggetz
Contributor Author

ggetz commented Sep 6, 2022

Code from Scene or Renderer would end up in cesium/engine, at least for this first iteration

Yes. I think it would be cleaner to move source files directly into this directory structure rather than having scripts copy or reference code from Source, but let me know if that's a concern.

Future iterations might further divide cesium/engine into smaller packages, perhaps separating graphics/geospatial math from rendering details

Yes, and this would be split along architectural boundaries such that units of code which require additional dependencies are separated into packages. To do so, we would increment a major version to denote the breaking change of modules moving to a separate package, and potentially cesium/engine would then depend on these new packages (for instance, if we publish cesium/core).

@bampakoa
Contributor

bampakoa commented Sep 7, 2022

@ggetz the initial plan looks great 💯

It may be a bit too early for this but maybe it would make sense to publish the new packages under the @cesium namespace. However, this would introduce breaking changes so I do not know if it is something that could be done atm.

Regarding the testing/linting process, I think it would be better to have tests alongside the code of each package so that it will be easier to find it.

Thanks for the effort 🙏

@mramato
Contributor

mramato commented Sep 12, 2022

I agree that switching to a @cesium namespace would be a good call as part of this work.

I'm definitely on board with the overall intent of the plan, but I wonder if we have actually done any small prototypes (either hacked up Cesium or a small POC completely separate from Cesium) of how this might work. I feel like there could be a lot of potential showstoppers (like JSDoc, TS generation) hiding here that will require some significant work. A small prototype will help expose those problems up front.

nosleep

Do we even need this? Cesium's VR support is ancient and I believe that's the only place this is used. I would definitely encourage us to take a second look at dependencies like this and figure out if the easier solution is to just delete code or remove a feature. How many people are actually using the Cesium VR support that exists today? (and is nosleep even needed for it).

cesium/engine

This alone definitely feels a bit too coarse for me. @kring might have some thoughts here but to me it makes sense that the general rule should be "new dependency, new module" so those geojson/kml/google-earth etc.. modules make a ton of sense for keeping things lean and mean.

@ggetz overall this is a great write-up and looking forward to seeing some proof-of-concepts.

@mramato
Contributor

mramato commented Sep 12, 2022

One other thing that comes to mind is public vs private APIs. Right now we have some bad practices where there is a lot of private API usage of the Cesium library. Would moving to modules have the rule that in order for one Cesium module to use the API of another, it has to be public? That seems the only way to dogfood things properly and make a good product. For example, I think even some low level utilities like Check are private, and therefore would need to be promoted to public to be used in "widgets".

@kring
Member

kring commented Sep 12, 2022

@kring might have some thoughts here

I don't have too much to add. Since you're planning to stick with one repo (makes sense to me), there's not a lot of cost to extra libraries, so generally going "too fine" is better than "too coarse" IMO. For example, this long-ish list of libraries in cesium-native:
https://github.com/CesiumGS/cesium-native#card_file_boxlibraries-overview

(and that's even though CesiumJS does a lot that cesium-native doesn't, like all the GeoJSON / KML / Google Earth stuff Matt mentioned doesn't exist at all in native)

But it also seems reasonable to me to start by breaking out some coarse pieces - even as coarse as engine and widgets, potentially - with the plan to break each of them up further into smaller pieces in the future. I don't think it will be too hard to maintain (temporary) backward compatibility during this evolution.

@ggetz
Contributor Author

ggetz commented Sep 12, 2022

Thanks @mramato and @kring!

I agree that switching to a @cesium namespace would be a good call as part of this work.

To publish with the @cesium scope, we would need a new cesium npm organization, which should be free as long as all packages are public.

I wonder if we have actually done any small prototypes (either hacked up Cesium or a small POC complete separate from Cesium) of how this might work. I feel like there could be a lot of potential showstoppers (like JSDoc, TS generation)

@sanjeetsuhag is currently working on this and we'll update here with results.

nosleep

Do we even need this? Cesium's VR support is ancient and I believe that's the only place this is used.

Yes, this is only used in the existing VR support. We'll consider removing if VR is not getting much usage.

@sanjeetsuhag
Contributor

sanjeetsuhag commented Sep 12, 2022

I am currently prototyping on the workspaces branch. I initially tested things on a simple npm package I created, now I'm trying to test things on the cesium repo.

At the moment, I have just broken up the Source folder out into engine and widgets packages, and used the existing tooling to build index.js and index.d.ts for each (although the widgets is slightly broken). For the top level Cesium.js, the gulpfile.cjs is broken, but I expect that will be easy to fix up.

Here's a TODO list of things to test before we can begin to shape this into a PR:

  • Hooking up esbuild
  • Splitting up Specs
  • Get build-docs running
  • Refactoring gulpfile.cjs and build.cjs
  • Clean up Apps

@javagl
Contributor

javagl commented Sep 13, 2022

There hasn't been sooo much feedback on the forum thread. But there's one point that I already mentioned in the forum, and that I wanted to bring up here, in view of the questions about "public vs private APIs" and the "long-ish list of libraries in cesium-native":

Will there be an attempt to align the package structure between CesiumJS and cesium-native?

It is phrased as a question, but from a very high-level perspective, there are good reasons to do that:

  • Discoverability and documentation. When people have invested a lot of time in learning a Cesium API, they'd probably appreciate it when they recognize certain structures in a different context. Maybe there could even be documentation pages that contain code snippets that can be toggled between the JavaScript/CesiumJS version or the C++/cesium-native version...?
  • When thinking in terms of "building blocks" and "APIs", one could suggestively ask: When it is warranted to have library "X" in cesium-native, what exactly is the reason to not have a library with the same scope in CesiumJS? This refers to granularity levels and blocks like "Math"/"Geometry"/"Geospatial"/"glTF". (Even though, of course, there are things like "CesiumAsync" or "CesiumGltfWriter" that are specific for C++).
  • It would add clarity to the dependency graphs (Roughly: The dependency graphs should be "similar"...)

One should not be tooo strict here. And of course, this should not mean to blindly transliterate the API or class structures: The languages and environments are just too different, and in many cases, there will be judgement calls about the exact contents, scope and structure of a library. But for example, when it comes to questions like "Does 'BoundingRegion' belong in 'Geometry' or 'Geospatial'?", one could have a look at cesium-native and see whether it's possible to align both environments on the conceptual level.

@sanjeetsuhag sanjeetsuhag mentioned this issue Sep 27, 2022
15 tasks
@ggetz ggetz moved this from In Progress to Notable backlog items in CesiumJS Issue/PR backlog Dec 7, 2022
@ggetz ggetz moved this from Notable backlog items to Next priority in CesiumJS Issue/PR backlog Dec 7, 2022
@ggetz ggetz moved this from Next priority to Notable backlog items in CesiumJS Issue/PR backlog Apr 12, 2023
@ggetz
Contributor Author

ggetz commented Jun 1, 2023

A more minimal, lighter-weight distribution was requested in #11323.

@jfayot
Contributor

jfayot commented Mar 5, 2025

Hi cesium team!

I have submitted a PR in gltf-pipeline repo in order to move forward on smaller packages and better dependency management.

Is the idea of creating a @cesium/core package still something you're willing to do? If so, what would be the strategy here?

CC @ggetz, @jjspace, @javagl

@javagl
Contributor

javagl commented Mar 5, 2025

I cannot speak about a specific strategy or timeline, but am strongly in favor of trying to proceed with that.

The way gltf-pipeline is integrated has always bugged me (some reasons recently summarized in #9614 (comment)). So it's not really a "strategy", but ... maybe self-evident ... when I say that one step could be to try and figure out what exactly gltf-pipeline is using from Cesium that prevents Cesium from using gltf-pipeline as a clean dependency (as there seems to be some circular dependency somewhere, and to break this, some "building block" has to be carved out). I'll still have to read the PR that you linked to more closely.

Maybe related: There is a branch that extracts... some form of "core" module, at #8359 (comment) , but similarly, I have not (yet) looked at the details.

@jfayot
Contributor

jfayot commented Mar 6, 2025

@javagl , concerning the features used by gltf-pipeline from Cesium, here's the complete list:

  • defined
  • defaultValue
  • WebGLConstants
  • ComponentDatatype
  • Check
  • getMagic
  • getStringFromTypedArray
  • RuntimeError
  • Cartesian3
  • Cartesian4
  • clone
  • Matrix4
  • Quaternion

All of which are from the Core part of @cesium/engine.

So strictly speaking, detaching a @cesium/core component would be enough to break this circular dependency.

graph TD;
    gb[["`**bin**: gltf-pipeline`"]]
    gl[@gltf-pipeline/lib]
    gp[["`**lib**: gltf-pipeline`"]]
    gc[@gltf-pipeline/core]
    c[[cesium]]
    ce[@cesium/engine]
    cc[@cesium/core]
    cw[@cesium/widgets]

    c-->ce;
    c-->cw;
    ce-->gc;
    ce-->cc;
    cw-->ce;
    cw-->cc;
    gb-->gl;
    gp-->gl;
    gp-->gc;
    gc-->cc;
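
To make the circular-dependency argument concrete, here is a minimal sketch of a cycle check over a package graph. The `before` graph is a simplified assumption about the current situation (the engine and gltf-pipeline each needing the other); the `after` graph uses a subset of the edges from the diagram above. The `findCycle` helper is hypothetical and not part of any existing tooling.

```javascript
// Depth-first search that returns the first dependency cycle found, or null.
function findCycle(graph) {
  const state = new Map(); // 1 = on the current path, 2 = fully explored
  const path = [];

  function visit(node) {
    if (state.get(node) === 2) return null;
    if (state.get(node) === 1) {
      // We came back to a node on the current path: report the loop.
      return [...path.slice(path.indexOf(node)), node];
    }
    state.set(node, 1);
    path.push(node);
    for (const dep of graph[node] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    state.set(node, 2);
    return null;
  }

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

// Assumed, simplified picture of the current state.
const before = {
  "@cesium/engine": ["gltf-pipeline"],
  "gltf-pipeline": ["@cesium/engine"],
};

// Subset of the proposed graph: both sides depend only on @cesium/core.
const after = {
  "@cesium/engine": ["@gltf-pipeline/core", "@cesium/core"],
  "@gltf-pipeline/core": ["@cesium/core"],
  "@cesium/core": [],
};

console.log(findCycle(before)); // [ '@cesium/engine', 'gltf-pipeline', '@cesium/engine' ]
console.log(findCycle(after)); // null
```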

My main concern is rather whether we should split this core component further now (e.g., with a @cesium/math package), or whether that should be done in a later step.

graph TD;
    c[cesium]
    ce[@cesium/engine]
    cc[@cesium/core]
    cm[@cesium/math]
    cw[@cesium/widgets]

    c-->ce;
    c-->cw;
    ce-->cc;
    cw-->cc;
    cw-->ce;
    ce-->cm;
    cc-->cm;

@javagl
Contributor

javagl commented Mar 6, 2025

At one point, CesiumJS was broken into widgets and engine, with the intention of splitting the latter further. But I'm not aware of any specific further steps (or even plans) for that.

As mentioned in the forum thread: I think that eventually, there should be a structural similarity between the packages of CesiumJS and cesium-native. Not just for the sake of it (even though "being familiar with a certain package structure" can be a huge head start for someone who already used one library and now wants to use the other one). But also because every deviation would raise technical questions about the justifications for a certain structure.

For example: When there's a class like CartographicRectangle that is once in a package geometry and once in a package rendering, then I'd ask "Why?". It should probably be in a package geospatial, though. And ... that's the interesting question: What will be the exact package structure? While the existing structure of subdirectories in CesiumJS may already give a hint (and may have contributed to easier separability), it is not obvious what the final structure should be.

So in general, I assume that the only practical way of creating these packages is to do this incrementally. Packages like core and math have been mentioned as further splits of engine. But to my understanding, this would mean

  • carve out a clean math package that is clearly defined
  • use core for ... all that other stuff ...

The "clean" packages should then contain only the things of which we are absolutely and 100% certain that they will never be removed (and preferably not even changed). The mantra from one of my favorite talks about API design that is applicable here is:

When in doubt, leave it out!

You can always add things later (in a non-breaking way), but you can never remove or change something without breaking clients. And for the math package, that should be relatively easy. It will contain Cartesian3 and Matrix4, and these are very unlikely to change.


More specifically about gltf-pipeline:

Looking at the list, I think that such a math package should mostly be sufficient here:

  • defined, defaultValue, Check: Most of them are trivial, internal, standalone one-liners (and some about to be replaced)
  • WebGLConstants, ComponentDatatype: Yeah, the GL constants. This could be in some cesium/rendering package, but since this is not part of the public API, I don't see a problem with duplicating them (these will definitely never change, so the usual reasoning for avoiding duplication doesn't apply here)
  • getMagic, getStringFromTypedArray: I think these are pretty simple, no need to pull in a full dependency for that
  • clone: Maybe that can nowadays be replaced with structuredClone...?
  • RuntimeError: That should have been GltfPipelineError to begin with!!! 😆

What's left is what clearly belongs into math.
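
A minimal sketch of what such local replacements could look like, assuming a runtime that provides the global `structuredClone` (recent browsers and NodeJS 17+). The `GltfPipelineError` class is hypothetical, and `defined` and `getMagic` here are local stand-ins written for illustration, not copies of the CesiumJS sources.

```javascript
// Local stand-in for `defined`: true unless the value is undefined or null.
function defined(value) {
  return value !== undefined && value !== null;
}

// Local stand-in for `getMagic`: reads four bytes as ASCII characters.
function getMagic(uint8Array, byteOffset = 0) {
  return String.fromCharCode(...uint8Array.subarray(byteOffset, byteOffset + 4));
}

// A library-specific error type instead of importing RuntimeError.
class GltfPipelineError extends Error {
  constructor(message) {
    super(message);
    this.name = "GltfPipelineError";
  }
}

// structuredClone makes a deep copy of plain data.
// It throws if the object contains functions, so it is not a drop-in
// replacement for every use of a hand-written clone helper.
const gltf = { asset: { version: "2.0" }, nodes: [{ translation: [1, 2, 3] }] };
const copy = structuredClone(gltf);
copy.nodes[0].translation[0] = 99;

console.log(gltf.nodes[0].translation[0]); // 1, the original is untouched
console.log(getMagic(new Uint8Array([0x67, 0x6c, 0x54, 0x46]))); // glTF
console.log(defined(0), defined(null)); // true false
```

One caveat: `structuredClone` always copies deeply, so call sites that rely on a shallow copy would need a closer look before switching.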


And more generically again:

One of the main challenges for breaking CesiumJS into smaller packages is that the dependencies are not cleanly modelled. There are functions that have 800 lines of code and do 10 different things. So there could be a single function that requires access to maybe 8 different "packages", if these could be clearly defined to begin with. Even worse: There are single lines that go.through._some.chain.of._private.properties.to.set._some.value = 123 that technically already touch several different classes.
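
To get a first impression of how widespread such chains are, a crude text scan might be enough. The sketch below is an assumption-laden illustration: `findPrivateAccesses` is a hypothetical helper, it treats any underscore-prefixed property as private, and it works on raw text, so it will also report hits inside comments or strings. A real audit would more likely use an AST-based linter rule.

```javascript
// Reports accesses to underscore-prefixed properties on anything other than `this`.
function findPrivateAccesses(source) {
  const hits = [];
  for (const match of source.matchAll(/\.\s*(_[A-Za-z][\w$]*)/g)) {
    // Look at the identifier directly in front of the dot.
    const before = source.slice(0, match.index).match(/([\w$]+)\s*$/);
    const owner = before ? before[1] : "";
    if (owner !== "this") {
      hits.push(`${owner}.${match[1]}`);
    }
  }
  return hits;
}

const sample = `
  this._scene = scene;
  go.through._some.chain.of._private.properties.to.set._some.value = 123;
`;

console.log(findPrivateAccesses(sample));
// [ 'through._some', 'of._private', 'set._some' ]
```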

So one question is: How did you determine the "List of things from CesiumJS that are used in gltf-pipeline"? (I think I also collected this list once, and I know that in this case, it's pretty easy to do this manually. In doubt, just remove the CesiumJS dependency and see where the compiler complains 😁 But I'm wondering if there is any form of tool that could help to automate that...)

@jfayot
Contributor

jfayot commented Mar 6, 2025

That's a long answer @javagl 😆 !

How did you determine the "List of things from CesiumJS that are used in gltf-pipeline"?

I simply ran npm run build-cesium and then searched for "../../Core" in the dist/cesium files.
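
For anyone who wants to automate that search, here is a minimal sketch that collects the imported Core module names from source text. The exact shape of the import paths (relative paths ending in `Core/<Name>.js`) is an assumption, and `collectCoreImports` is a hypothetical helper; reading the files from disk is left out so the sketch stays platform-neutral.

```javascript
// Collects the unique module names imported from a "Core" directory.
function collectCoreImports(sources) {
  const names = new Set();
  const pattern = /["'](?:\.\.\/)+Core\/([\w$]+)\.js["']/g;
  for (const source of sources) {
    for (const match of source.matchAll(pattern)) {
      names.add(match[1]);
    }
  }
  return [...names].sort();
}

// Stand-ins for the contents of two built files.
const files = [
  'import defined from "../../Core/defined.js";\nimport Cartesian3 from "../../Core/Cartesian3.js";',
  'import defined from "../../Core/defined.js";\nimport Matrix4 from "../../Core/Matrix4.js";',
];

console.log(collectCoreImports(files)); // [ 'Cartesian3', 'Matrix4', 'defined' ]
```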

@jfayot
Contributor

jfayot commented Mar 6, 2025

Ok so let me try to summarize your thoughts @javagl :

  1. In gltf-pipeline, eliminate as many dependencies as possible:
    a. by duplicating some trivial functions like defined, Check... (deliberately omitting defaultValue --> ??)
    b. by replacing some imported class like RuntimeError by appropriate local implementations
    c. by replacing candidate functions by javascript standards (like clone)
    ...
  2. Should only remain math functions on one side and webgl constants on the other side.
  3. Go through an iterative process that progressively detaches some well-identified features from @cesium/engine.
  4. When @cesium/math and @cesium/rendering are out, move gltf-pipeline to depend on those.

Am I getting it right?

You can always add things later (in a non-breaking way), but you can never remove or change something without breaking clients. And for the math package, that should be relatively easy. It will contain Cartesian3 and Matrix4, and these are very unlikely to change.

This is not so easy in my understanding. Taking the math example: as you move math functions from @cesium/engine to @cesium/math, you're only adding features to @cesium/math, so no breaking change here. But at one point, you'll have to deprecate those functions from @cesium/engine, so whatever you do, you'll always introduce breaking changes somewhere...
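
One way to soften that break, sketched here under assumptions, is to keep the old export around for a while as a thin wrapper that warns and forwards to the new location. The `deprecated` helper and the two stand-in objects are hypothetical; they only model the idea of one package re-exporting from another.

```javascript
// Stand-in for the new package: the function's new home.
const cesiumMath = {
  lerp(p, q, time) {
    return (1.0 - time) * p + time * q;
  },
};

// Hypothetical helper: wraps a function so that the first call emits a warning.
function deprecated(fn, message, warn = console.warn) {
  let warned = false;
  return function (...args) {
    if (!warned) {
      warned = true;
      warn(message);
    }
    return fn.apply(this, args);
  };
}

// Stand-in for the old package: the same function is still exported, but flagged.
const warnings = [];
const cesiumEngine = {
  lerp: deprecated(
    cesiumMath.lerp,
    "lerp has moved to the math package and will be removed from the engine package.",
    (message) => warnings.push(message)
  ),
};

console.log(cesiumEngine.lerp(0, 10, 0.5)); // 5
console.log(cesiumEngine.lerp(0, 10, 1.0)); // 10
console.log(warnings.length); // 1, the warning is emitted only once
```

The breaking change still happens when the wrapper is finally removed, but clients get at least one release cycle of notice.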

@javagl
Contributor

javagl commented Mar 6, 2025

Am I getting it right?

That's what I'd consider a reasonable approach, but ... let's give others a chance to chime in here.

But at one point, you'll have to deprecate those functions from @cesium/engine, so whatever you do, you'll always introduce breaking changes somewhere...

Yes, there will be breaking changes, on many levels (and the exact deprecation process has to be sorted out). But one important goal (that I could only describe somewhat abstractly) is about stabilities in the resulting dependency hierarchy. There are areas in the code where new features are added frequently (think about rendering glTF files and adding new extensions as they are ratified). There are areas in the code that will basically never change (e.g. math stuff like Cartesian3). And the dependencies should preferably always be in the direction of stability: Things that change should depend on things that are stable. (The opposite is not even possible, in some way: One cannot create something stable based on something that changes frequently. Shoutout to all the Web Frontend developers out there ... 😬 )
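
The "depend in the direction of stability" rule can be written down as a small check. In this sketch, the stability ranks are invented for illustration (higher means "changes less often"), the edges follow the second diagram earlier in this thread, and `unstableEdges` is a hypothetical helper.

```javascript
// Hypothetical stability ranks: a higher number means the package changes less often.
const stabilityRank = {
  "@cesium/math": 3,
  "@cesium/core": 2,
  "@cesium/engine": 1,
};

// Package -> packages it depends on.
const packageDeps = {
  "@cesium/engine": ["@cesium/core", "@cesium/math"],
  "@cesium/core": ["@cesium/math"],
  "@cesium/math": [],
};

// Lists every edge where a package depends on something less stable than itself.
function unstableEdges(graph, ranks) {
  const violations = [];
  for (const [from, deps] of Object.entries(graph)) {
    for (const to of deps) {
      if (ranks[to] < ranks[from]) {
        violations.push(`${from} -> ${to}`);
      }
    }
  }
  return violations;
}

console.log(unstableEdges(packageDeps, stabilityRank)); // []

// If the math package ever reached back into the engine, the check would flag it.
const broken = { ...packageDeps, "@cesium/math": ["@cesium/engine"] };
console.log(unstableEdges(broken, stabilityRank)); // [ '@cesium/math -> @cesium/engine' ]
```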

@ggetz
Contributor Author

ggetz commented Mar 6, 2025

by duplicating some trivial functions like defined, Check... (deliberately omitting defaultValue --> ??)

I think this brings up a big topic that will need to be considered here: What feature set can these lower-level packages depend on? Are they required to run in both the browser and in NodeJS? Should they target a lowest common denominator for browsers, or can we assume they will be bundled and transpiled?

@javagl
Contributor

javagl commented Mar 6, 2025

I don't have the full picture of what exactly is working where. Only roughly: Do something with document? Yeah, not in NodeJS. Reading a file with fs.readFileSync? Well, how should a browser know what a file is.

But I think that for many packages (like math, geometry, geospatial, or whatever it will be) the question should not come up. In others, the answer may be evident (widgets or rendering), and in the remaining ones, one could look more closely, on a case-by-case basis: What should be contained in this package, and would that make it incompatible with either NodeJS or the Browser?

Maybe this also helps to raise awareness. Maybe it's possible to introduce some abstractions (like WebIO and NodeIO in glTF-Transform). And maybe it's possible to keep the main part of the package compatible with both, and add some tiny add-on for the platform-specific part - like gltf-pipeline, with the suggested core and the tiny cli, with the latter just containing the file/command-line handling.
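
A minimal sketch of that kind of abstraction, under the assumption that the platform-neutral core only ever asks an `io` object for bytes. The `readJson` function and the `MemoryIO` class are hypothetical names made up for this illustration; they are not the glTF-Transform API. `TextEncoder` and `TextDecoder` are globals in both browsers and NodeJS.

```javascript
// Platform-neutral core: it only knows that `io.read(uri)` resolves to a Uint8Array.
async function readJson(io, uri) {
  const bytes = await io.read(uri);
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Hypothetical in-memory implementation that works on either platform.
// A browser add-on could implement read() with fetch,
// and a NodeJS add-on could implement it with the fs module.
class MemoryIO {
  constructor(files) {
    this.files = files;
  }

  async read(uri) {
    if (!(uri in this.files)) {
      throw new Error(`Not found: ${uri}`);
    }
    return this.files[uri];
  }
}

const io = new MemoryIO({
  "tileset.json": new TextEncoder().encode('{"asset":{"version":"1.1"}}'),
});

readJson(io, "tileset.json").then((json) => {
  console.log(json.asset.version); // 1.1
});
```

The core function never touches `document` or `fs`, so the question of browser versus NodeJS compatibility is pushed entirely into the small IO classes.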

@ggetz
Contributor Author

ggetz commented Mar 10, 2025

@jfayot @javagl Specifically in regards to gltf-pipeline, we have begun some initial conversations in CesiumGS/gltf-pipeline#670, the results of which will determine the plan going forward for how we deal with the cyclical dependencies between that and CesiumJS.

Status: Notable backlog items