Chalk Vision #15
viega
announced in
Announcements
Each of these objectives now has its own ticket, but I've temporarily locked those; for now, any feedback can come to this thread, or be brought to us privately if you prefer. We definitely don't want to spread the conversation across multiple places.
Note that we are using the in-toto standard for signing today, and are fans.
Definitely could collect any other signatures found too… good suggestion!
On Mon, Oct 2, 2023 at 10:52 PM Trishank Karthik Kuppusamy wrote:
Excited to try Chalk, and see where it goes! One thing I might add to Broad
Objective 1: More metadata collection is support for de facto standards
such as in-toto attestations <https://github.com/in-toto/attestation>.
Would be really nice to collect metadata in a shared way.
Cc fellow in-toto Steering Committee members @SantiagoTorres
<https://github.com/SantiagoTorres> @JustinCappos
<https://github.com/JustinCappos> @colek42 <https://github.com/colek42>
@06kellyjac <https://github.com/06kellyjac>
Chalk Vision
Introduction
After launching Chalk last week, we needed to pause for a bit to tie up some loose ends, catch our breath, and turn toward planning. We’re still going to take a few days to come up with our own short-term priorities.
But in the meantime, we wanted to communicate more about where we’re planning to take Chalk… eventually. Our goal here is to help people better understand the principles we value and are building toward, and to provide a basis for conversation with anyone who wants to make any sort of contribution, whether it be code or just ideas and discourse.
There's far more implied here than we'll get to quickly. We probably won't even get detailed issues for items into the system up front; we'll start with things we're considering for short-term work. But if you are interested in one of these areas, do let us know, as we're always looking for more data from the user base.
Nothing in this document should be taken to indicate relative priority or timeline. We’re not working on all of this stuff right now. None of these categories are inherently prioritized in any way relative to each other.
We’ll definitely be communicating soon about shorter term goals too; just expect those priorities to map into these broad objectives.
Overview
If I were to sum up our guiding design principles here, they would be:
Plus, we obviously want to have the comprehensive functionality people would put to use. So we want to be great at metadata collection and reporting in all environments.
As a result, you'll find below that we've grouped everything into seven broad objectives:
We hope we're doing well already toward all of these objectives, but there's plenty more we would like to do!
Broad Objective 1: More metadata collection
With our focus on trying to tie together data through an app's lifecycle, we certainly want to be on a never-ending treadmill to help collect data from important technologies. And we want to make it easy for people to do the integrations they need for their own uses (and ideally contribute them back to the world).
In production, for instance, we do have a reasonable start for local collection and AWS metadata collection, but there's more to do there.
And, there is plenty we aren't doing at all, yet, such as cloud metadata for the other major public clouds. We don't do integration with k8s or cloud provider container runtimes.
Not to mention, we haven't paid enough attention to what other tools we should be integrating with in CI/CD for information. For instance, deep integrations w/ package management ecosystems seem likely to lead to some large benefits for users on a lot of fronts.
Additionally, we need to do a little more to facilitate people adding to our data collection facilities if they don't know Nim-- the API is designed so that you could link in plugins, but we would need a little bit more plumbing to actually leverage external plugins.
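To make the plugin idea concrete, here is a minimal sketch of what a registry of externally supplied collectors could look like. It is written in Python purely for illustration and is not Chalk's actual plugin API; the names `register`, `collect_all`, `file_size`, and the `SIZE_BYTES` key are all hypothetical.

```python
import os
from typing import Callable, Dict

# Hypothetical plugin type: takes an artifact path, returns key/value metadata.
Collector = Callable[[str], Dict[str, str]]

# Hypothetical registry mapping a plugin name to its collector function.
_PLUGINS: Dict[str, Collector] = {}


def register(name: str) -> Callable[[Collector], Collector]:
    """Decorator that adds a collector function to the registry."""
    def wrap(fn: Collector) -> Collector:
        _PLUGINS[name] = fn
        return fn
    return wrap


@register("file_size")
def file_size(path: str) -> Dict[str, str]:
    """Example collector: report the artifact's size on disk."""
    return {"SIZE_BYTES": str(os.path.getsize(path))}


def collect_all(path: str) -> Dict[str, Dict[str, str]]:
    """Run every registered collector; one failing plugin does not stop the rest."""
    report: Dict[str, Dict[str, str]] = {}
    for name, fn in _PLUGINS.items():
        try:
            report[name] = fn(path)
        except Exception as exc:  # isolate plugin failures
            report[name] = {"ERROR": str(exc)}
    return report
```

The design choice worth noting is the error isolation in `collect_all`: a third-party collector that fails is recorded in the report rather than aborting the whole collection.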
There is also metadata that people would like Chalk to orchestrate collecting during CI/CD that we wouldn't support today, because it would slow build times too much. We should consider a general ability to orchestrate out-of-band collection of metadata.
For instance, some companies want to run security analysis tools, like static analysis, on every build; we might want to support this via Chalk to further the goal of tying everything together in as simple a way as possible.
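One way to picture out-of-band collection is sketched below, under heavy assumptions: the build returns immediately with an identifier for the artifact, while a slow tool runs in the background and later produces a report carrying the same identifier so the two can be joined. This is not how Chalk works today; `artifact_id`, `slow_analysis`, and `build_with_out_of_band` are hypothetical names, and the SHA-256 digest is only a stand-in for whatever identifier would really tie the data together.

```python
import hashlib
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, Tuple


def artifact_id(content: bytes) -> str:
    """Hypothetical stand-in for an artifact identifier: a SHA-256 hex digest."""
    return hashlib.sha256(content).hexdigest()


def slow_analysis(content: bytes) -> Dict[str, int]:
    """Placeholder for a slow tool, such as static analysis."""
    time.sleep(0.1)  # simulate expensive work
    return {"bytes_scanned": len(content)}


def build_with_out_of_band(
    content: bytes, pool: ThreadPoolExecutor
) -> Tuple[Dict[str, str], Future]:
    """Return the build record right away, plus a future for the late report."""
    aid = artifact_id(content)
    build_record = {"artifact_id": aid, "status": "built"}

    def late_report() -> Dict[str, object]:
        # The report repeats the identifier so it can be joined to the build later.
        return {"artifact_id": aid, "analysis": slow_analysis(content)}

    return build_record, pool.submit(late_report)
```

A caller would create a `ThreadPoolExecutor`, finish the build with `build_record`, and call `.result()` on the returned future whenever the analysis is actually needed.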
Broad Objective 2: Very flexible runtime data collection
In our quest for better observability of software, Chalk currently allows very lightweight, limited data collection in production, at process startup and at a regular interval. However, there are some challenges there:
Of course, in production infrastructure, performance and cost are paramount for all the items above. "Do no harm" should definitely be our mantra, with good controls to give people confidence on those issues.
For the above, do note that we've been working on 5) above as a separate project for quite a while (longer than Chalk), which is pretty far along. We need to make sure everything comes together well, there.
Broad Objective 3: Easy Deployment Everywhere
We definitely strive for friction-free deployment wherever possible; that goal led to our approach for handling docker builds and to our configuration strategy. Still, there is a lot more that should be done, especially in making Chalk work as well as possible with a wide variety of technology stacks, not just docker builds.
There are multiple dimensions to this-- we should make it trivially easy to deploy across organizations using built-in CI/CD tooling at GitHub/GitLab, not to mention other popular CI/CD tools.
But we should also be thinking about how to make it easy to deploy in environments leveraging composition tools like docker bake or docker compose. Currently, without us integrating with those pieces, deploying effectively is probably harder than it should be, because it takes more work than should be needed to tie data together.
Similarly, we aren't doing anything to support other container runtimes, which complicates deployments using them. We also don't cover all cases for the docker deployment we do cover, as a few scenarios will disable container wrapping (like multi-arch builds, which are pretty important in our view) and some unsupported features (like Heredocs) will cause us to fall back to a regular invocation of Docker.
Broad Objective 4: Better Platform Coverage
Today, we've focused on the platforms that are most common to the pain points we've heard in our initial investigations, namely production systems that are predominantly Linux-based backend workloads, running on amd64 or arm64 platforms. We've also found a "quick and dirty" way to support macOS, but that is intended more to facilitate development, since many of us use the OS.
Certainly, we know from the enthusiastic responses and asks we've gotten as we've previewed Chalk that people would deeply benefit from having the same kind of observability in plenty of other environments. That includes basic marking (and extraction) on other platforms, and also the ability to collect environmental data at runtime, where doing so performantly and safely is feasible.
For instance:
One challenge in some of these scenarios is that the architecture pretty deeply embeds a "posix" assumption. Perhaps that's not too bad, as Windows is the big outlier these days, and WSL2 is probably a fairly achievable target without an ungainly amount of work.
For this objective, we currently have a preference for production usage, over environments that are more user-centric.
Broad Objective 5: Strong Documentation
We've definitely written a lot of documentation so far, and think it's important for docs to be easy to find and understandable. We also think there need to be both sufficient reference-style docs, and sufficient task-oriented docs (tutorials and how-tos).
And, we want to minimize outdated documentation. In the code, we have tried to make that task more manageable for user docs by keeping docs inline next to the config options and flags.
Unfortunately, as I write this, we're completely lacking in developer documentation. Yet, Chalk is actually a fairly sizable project, to the point where code plus comments is nowhere near enough to help make it easy for people to contribute.
The people who contribute will undoubtedly want to understand the "lay of the land" so to speak, meaning that they'll want to have a reasonable understanding of the architecture, and how they would plug into it.
Some reference documentation would be valuable, but it will be more important for people to have task-oriented how-to guides to tackle the enhancements they're most likely to want to make.
At the very least, we should probably have clear guidance on:
All of the above will need to be supported by an overview on how to work with the config files.
Broad Objective 6: Command line usability
For configuration
Ideally, we wouldn't want to directly expose our configuration language to anyone except power users (the config language is not-so-secret sauce for making the uncommon things easy).
And we've done a lot of work toward hiding it. For instance, the chalk load command can take a URL. But, off-the-shelf configs need to be tweaked... people need parameters. And they shouldn't have to drop into a text editor to provide them.
People should be able to configure based on the problem they're going to solve. If they want to tackle something like SBOM implementation, they shouldn't have to navigate a lot of configuration that's irrelevant to the problem. It should be easy to find the configuration UI they need, whatever it is.
Any UI we provide here needs to be not only incredibly easy to use, it needs to be easy to keep in sync w/ any changes to chalk over time.
Eventually, 'wizard'-like interfaces or other similar things might be great, but just being able to reliably configure stuff easily from the command line would need to come first!
For help
We've worked hard to make sure we can provide consistent help in-command and on the web. But the UX for the in-app help did NOT get finished.
The framework is all there to make help easy to search, and easy to browse individual help items. The help's all available. However, we have not finished the presentation layer (everything looks ugly with lots of formatting problems, and while full text search works, it can produce an unmanageable mess of results at the moment).
Beyond that, we should also be considering how much value people would get if we had better "navigation" primitives, both for the help system, and possibly for operating the command in general (e.g., some basic TUI options, probably via wrapping notcurses).
Other commands
We definitely want to improve the user experience for any common command-line operations; for instance, using the chalk extract command to view data about artifacts in real time. Based on early feedback, we learned that most of the time it's better to show only small summary reports for command-line operation; but when people do want the full reports, grepping a log and piping it to jq isn't a great, easy-to-use experience for me personally.
We could imagine a chalk log command that focuses on just removing the cognitive burden, a full TUI interface, both, or none of the above, based on feedback.
Broad Objective 7: Healthy Core
We're committed to making sure our core enables developers to support our functional and non-functional requirements, in a way that is pretty easy for them.
We have built the Chalk core targeting it being:
Still, there are many ways in which we can be advancing the above goals. First, key to giving us the flexibility and power while helping make it easy to deliver on usability is our underlying configuration language, con4m.
While con4m is more than good enough for our initial Chalk release, it is nowhere near what we would consider good enough for everyone else, as it mainly got enough attention to "jump the bar" for where Chalk is today, and no more.
We're not in a hurry to get con4m ready for projects outside our own, but we are committed to keeping it in good shape for chalk, and to designing changes in a way that could be used by other projects from day one.
There are lots of warts we should smooth over (many of them intentional choices due to timeline), like syntax that should work better. And we focused on robustness over performance, since performance was clearly good enough for an initial release.
Just in the configuration system, there are plenty of big wins that wouldn't take too much work. For instance, most of the embedded configuration can be pre-computed with a little bit of refactoring, which would eliminate most of an (already small) startup cost.
But outside of the configuration, there are plenty such items.
For instance, there are places where, for expediency, we used dependencies that weren't quite ideal for the use case. When signing containers and other artifacts, for example, we are currently dependent on Sigstore's cosign, which is a great tool, but is itself large and may need to be downloaded often, depending on the deployment, which isn't great when build times are crucial. Yet most of the core primitives to implement what we need from Sigstore are already available to us via the underlying cryptography library. We just need to implement enough pieces on top of it to provide what we need from Sigstore without being dependent on another library.
Another example where our (few) outside dependencies aren't good enough is our Zip file handling. It currently isn't easy to do what we need to do without heavy use of tmp files, which is an issue for disk usage, among other things. We could bring in a different dependency to balance our needs better, though that would take some work, of course.
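As an illustration of the general technique of editing a Zip archive without tmp files, here is a small sketch using Python's standard `zipfile` and `io` modules. It is not Chalk's implementation (Chalk is written in Nim); the helper name `add_member_in_memory` and the member names in the usage note are hypothetical.

```python
import io
import zipfile


def add_member_in_memory(zip_bytes: bytes, name: str, data: bytes) -> bytes:
    """Return a new archive with `name` added or replaced, never touching disk."""
    src = io.BytesIO(zip_bytes)
    dst = io.BytesIO()
    with zipfile.ZipFile(src, "r") as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            # Copy every existing member except the one being replaced.
            if info.filename != name:
                zout.writestr(info, zin.read(info.filename))
        zout.writestr(name, data)
    return dst.getvalue()
```

For example, `add_member_in_memory(archive_bytes, "mark.json", b"{}")` returns the bytes of a new archive containing the extra member. The trade-off is that the whole archive is held in memory, so this swaps disk usage for memory footprint.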
The Zip issue overlaps with another potential concern that can impact robustness... memory footprint. In designing the initial release, we didn't worry too much about it-- virtual memory is pretty effective! But if we want to do better for more constrained environments, we can do a lot better here (for instance, even ensuring the large embedded documentation corpus lives in static memory would be a decent win if memory becomes an issue).
Conclusion
Sure, we're excited about what we've built, and we love the use and the feedback we're already getting. But, we've got a lot we want to do here. We're planning to be actively developing Chalk for a very long time.
We're going to focus on the above vision. Today, we'll be populating the issues with the seven objectives above, and as we start working on things, we'll have any feature tickets link back to those objectives.
We'd certainly love contributions, but feedback and discussion about what Chalk does, what it could do, and what it should do, is also tremendously valuable to help us prioritize and evolve our goals.
So please, don't hesitate to reach out to us to discuss!