 class SNMFOptimizer:
+    """A self-contained implementation of the stretched NMF algorithm (sNMF),
+    including sparse stretched NMF.
+
+    Instantiating the SNMFOptimizer class runs all the analysis immediately.
+    The results matrices can then be accessed as instance attributes
+    of the class (X, Y, and A).
+
+    For more information on sNMF, please reference:
+    Gu, R., Rakita, Y., Lan, L. et al. Stretched non-negative matrix factorization.
+    npj Comput Mater 10, 193 (2024). https://doi.org/10.1038/s41524-024-01377-5
+    """
+
     def __init__(
         self,
         MM,
@@ -17,48 +29,33 @@ def __init__(
         n_components=None,
         random_state=None,
     ):
-        """Run sNMF based on an ndarray, parameters, and either a number
-        of components or a set of initial guess matrices.
-
-        Currently instantiating the SNMFOptimizer class runs all the analysis
-        immediately. The results can then be accessed as instance attributes
-        of the class (X, Y, and A). Eventually, this will be changed such
-        that __init__ only prepares for the optimization, which will can then
-        be done using fit_transform.
+        """Initialize an instance of SNMF and run the optimization.

         Parameters
         ----------
         MM: ndarray
-            A numpy array containing the data to be decomposed. Rows correspond
-            to different samples/angles, while columns correspond to different
-            conditions with different stretching. Currently, there is no option
-            to treat the first column (commonly containing 2theta angles, sample
-            index, etc) differently, so if present it must be stripped in advance.
+            The array containing the data to be decomposed. Shape is (length_of_signal,
+            number_of_conditions).
         Y0: ndarray
-            A numpy array containing initial guesses for the component weights
-            at each stretching condition, with number of rows equal to the assumed
-            number of components and number of columns equal to the number of
-            conditions (same number of columns as MM). Must be provided if
-            n_components is not provided. Will override n_components if both are
-            provided.
+            The array containing initial guesses for the component weights
+            at each stretching condition. Shape is (number_of_components,
+            number_of_conditions). Must be provided if n_components is not provided.
+            Will override n_components if both are provided.
         X0: ndarray
-            A numpy array containing initial guesses for the intensities of each
-            component per row/sample/angle. Has rows equal to the rows of MM and
-            columns equal to n_components or the number of rows of Y0.
+            The array containing initial guesses for the intensities of each component per
+            row/sample/angle. Shape is (length_of_signal, number_of_components).
         A: ndarray
-            A numpy array containing initial guesses for the stretching factor for
-            each component, at each condition. Has number of rows equal to n_components
-            or the number of rows of Y0, and columns equal to the number of conditions
-            (columns of MM).
+            The array containing initial guesses for the stretching factor for each component,
+            at each condition. Shape is (number_of_components, number_of_conditions).
         rho: float
-            A stretching factor that influences the decomposition. Zero corresponds to
-            no stretching present. Relatively insensitive and typically adjusted in
-            powers of 10.
+            The float which sets a stretching factor that influences the decomposition.
+            Zero corresponds to no stretching present. Relatively insensitive and typically
+            adjusted in powers of 10.
         eta: float
-            A sparsity factor than influences the decomposition. Should be set to zero
-            for non sparse data such as PDF. Can be used to improve results for sparse
-            data such as XRD, but due to instability, should be used only after first
-            selecting the best value for rho.
+            The float which sets a sparsity factor that influences the decomposition.
+            Should be set to zero for non-sparse data such as PDF. Can be used to improve
+            results for sparse data such as XRD, but due to instability, should be used
+            only after first selecting the best value for rho.
         max_iter: int
             The maximum number of times to update each of A, X, and Y before stopping
             the optimization.
@@ -71,10 +68,9 @@ def __init__(
             be overridden by Y0 if that is provided, but must be provided if no Y0 is
             provided.
         random_state: int
-            Used to set a reproducible seed for the initial matrices used in the
-            optimization. Due to the non-convex nature of the problem, results may vary
-            even with the same initial guesses, so this does not make the program
-            deterministic.
+            The integer which acts as a reproducible seed for the initial matrices used in
+            the optimization. Due to the non-convex nature of the problem, results may vary
+            even with the same initial guesses, so this does not make the program deterministic.
         """

         self.MM = MM
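The shape conventions in the revised docstring can be checked with a short NumPy sketch. This is only an illustration under stated assumptions, not code from the PR: the helper `check_shapes`, the sizes, and the seed are hypothetical, and it assumes that a stretching factor of 1 everywhere corresponds to no stretching, so that the data reduce to the ordinary product `X @ Y`. It does not call `SNMFOptimizer`, whose full signature is not shown in this diff.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
length_of_signal = 50
number_of_conditions = 8
number_of_components = 2

# A seeded generator stands in for the role random_state plays in the docstring;
# how the class actually seeds its initial matrices is not shown in the diff.
rng = np.random.default_rng(42)

# Initial guesses with the shapes given in the docstring.
X0 = rng.random((length_of_signal, number_of_components))
Y0 = rng.random((number_of_components, number_of_conditions))
# Assumption: a stretching factor of 1 for every component and condition
# means "no stretching".
A0 = np.ones((number_of_components, number_of_conditions))

# Under that assumption, synthetic data is just the non-negative product X @ Y.
MM = X0 @ Y0


def check_shapes(MM, X0, Y0, A0):
    """Hypothetical helper: verify the docstring's shape rules.

    Returns the number of components, or raises ValueError on a mismatch.
    """
    signal_len, n_conditions = MM.shape
    n_components = Y0.shape[0]
    if Y0.shape != (n_components, n_conditions):
        raise ValueError("Y0 must be (number_of_components, number_of_conditions)")
    if X0.shape != (signal_len, n_components):
        raise ValueError("X0 must be (length_of_signal, number_of_components)")
    if A0.shape != (n_components, n_conditions):
        raise ValueError("A must be (number_of_components, number_of_conditions)")
    return n_components


n_components = check_shapes(MM, X0, Y0, A0)
print(MM.shape, n_components)
```

Taking the component count from the rows of `Y0` mirrors the docstring's rule that `Y0` overrides `n_components` when both are provided.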