other code changes/additions -- please expand as needed, there will likely be some
decide on how to "split off" the library code and make it public while this repo remains private? preservation of git history vs. keeping research plans to ourselves for now, etc. There are various logistical annoyances related to that
Add an appropriate project license, considering what we import/link to, what FCI approved, and all that (may need to talk to FCI again if we change from original agreement, and to sort out the multiple sub repos for library vs. research code, etc.)
(Tyler) PyPI release process -- we may not need binaries if we don't have compiled code, but should at least update pyproject.toml to match modern standards for support metadata, etc.
conda-forge release process
(Tyler) Portability -- we should probably be conformant with SPEC 0 -- this likely boils down to supporting Python 3.12 - 3.14 and testing for those in the CI
Do an official/GitHub immutable release when the (likely separate?) library repo is public, and assign it a Zenodo DOI
Pick a journal -- we are NOT eligible for JOSS (https://joss.theoj.org/) because we don't have 6 months of sustained public engagement on, e.g., GitHub
Draft the paper somewhere -- I can create a private Overleaf link that up to 10 people can edit -- I believe this should be "ok" since it would be similar level of privacy to this GitHub repo (can we loop Kyle/Kostas in a bit, or not for the code?) -- https://www.overleaf.com/3111751118mvxmcpqnqxmw#eb42c5
Get an LA-UR for the publication once we have a draft
Draft developer docs on how to add additional RVFL architectures in the future? Is that process clear?
(Emma) Remove StandardScaler usage from the estimators proper
what about the OneHotEncoder usage--double check that as well vs. requiring properly encoded input (I think one-hot requirement was more fundamental, but let's think about that)
Emma would like to hoist the weight initializers out of the classes - Emma
Emma would like to see some more input validation/error checking
Add a regression equivalent of EnsembleRVFLClassifier() - Emma
Allow for "online learning" via partial_fit since Grafo supports this, sklearn's MLPClassifier supports it, and it may be necessary for training with very large design matrices - Emma
Follow up with Kostas regarding the iterative scheme we need to test (SGD vs GD) for comparison with exact solve and clarify the details of the numerical test - Navamita
Split off appropriate parts of the library code (which parts?) for incorporation into a fork/branch of scikit-learn, and iterate with the team using a forked repo of scikit-learn (try to add all 3 of us in the condensed/redone commit history)
In the scikit-learn fork, work with the team to draft the PR description we'll eventually use to propose our addition--this should be well-written markdown that includes examples of relevant papers that have been well cited, a clear explanation of what RVFL is, and so on, to convince the scikit-learn developers that this work is of sufficiently broad interest to be useful at the base of the Python ML ecosystem
in the scikit-learn fork, adjust our docstrings to match the exact standards they use
in the scikit-learn fork, make sure we pass their CI with their full test suite/linting requirements, etc.
decide on the name of our open source library project (rvfl may not suit if we also offer ELM, etc.?), then open source it under LANL org
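For the partial_fit item above, here is a rough sketch of one way online learning could work for a ridge-regularized RVFL-style regressor: accumulate the normal-equation terms batch by batch and re-solve for the output weights. Everything here is an assumption for illustration, not the library's API: the class name `OnlineRVFLSketch`, the `tanh` activation, the uniform weight initialization, and the `alpha` ridge penalty are all hypothetical.

```python
import numpy as np


class OnlineRVFLSketch:
    """Hypothetical illustration only; not this library's API."""

    def __init__(self, n_hidden=20, alpha=1e-2, seed=0):
        self.n_hidden = n_hidden
        self.alpha = alpha  # ridge penalty (assumed)
        self.rng = np.random.default_rng(seed)
        self.W = None

    def _features(self, X):
        # Fixed random hidden features plus the direct input-to-output link.
        H = np.tanh(X @ self.W + self.b)
        return np.hstack([X, H])

    def partial_fit(self, X, Y):
        X = np.asarray(X, dtype=float)
        Y = np.asarray(Y, dtype=float)  # expected shape: (n_samples, n_outputs)
        if self.W is None:
            # First batch: draw the random weights once and zero the accumulators.
            n_in = X.shape[1]
            self.W = self.rng.uniform(-1, 1, size=(n_in, self.n_hidden))
            self.b = self.rng.uniform(-1, 1, size=self.n_hidden)
            d = n_in + self.n_hidden
            self.A = self.alpha * np.eye(d)
            self.B = np.zeros((d, Y.shape[1]))
        D = self._features(X)
        # Only the (d x d) and (d x n_outputs) sums are kept between batches,
        # so the full design matrix never has to be held in memory.
        self.A += D.T @ D
        self.B += D.T @ Y
        self.beta = np.linalg.solve(self.A, self.B)
        return self

    def predict(self, X):
        return self._features(np.asarray(X, dtype=float)) @ self.beta


# Synthetic data, purely for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
Y = np.sin(X[:, :1]) + 0.5 * X[:, 1:2]

online = OnlineRVFLSketch().partial_fit(X[:100], Y[:100]).partial_fit(X[100:], Y[100:])
batch = OnlineRVFLSketch().partial_fit(X, Y)

# Two half-batches and one full batch should agree up to round-off.
max_diff = np.max(np.abs(online.beta - batch.beta))
print(max_diff)
```

Because the accumulated sums are identical whether the data arrive in one batch or several, this approach reproduces the exact solve rather than approximating it. Whether that fits the actual estimators depends on how they solve for the output weights.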
@nray @eiviani-lanl just migrating this tracking issue to the public repo via copy/paste of current state.
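As a possible starting point for the iterative-scheme item (SGD vs GD compared with the exact solve), here is a minimal sketch of what such a numerical test might look like for full-batch gradient descent only. The details of the real test are still to be clarified, so the ridge objective, the synthetic data, the step size, and the iteration count below are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a design matrix and targets (assumed, not real data).
D = rng.normal(size=(100, 5))
y = D @ rng.normal(size=5) + 0.01 * rng.normal(size=100)
alpha = 1e-2  # ridge penalty (assumed)

# Exact solve of the regularized normal equations.
A = D.T @ D + alpha * np.eye(5)
b = D.T @ y
beta_exact = np.linalg.solve(A, b)

# Full-batch gradient descent on
#   0.5 * ||D beta - y||^2 + 0.5 * alpha * ||beta||^2,
# whose gradient is A @ beta - b. A step of 1 / lambda_max(A) guarantees descent.
step = 1.0 / np.linalg.eigvalsh(A).max()
beta_gd = np.zeros(5)
for _ in range(5000):
    beta_gd -= step * (A @ beta_gd - b)

gap = np.linalg.norm(beta_gd - beta_exact)
print(gap)
```

An SGD variant would replace the full gradient with a mini-batch estimate and would likely need a decaying step size, so its gap to the exact solution would be judged against a looser tolerance.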