- "text": "Sampled dense-dense matrix product (SDDMM) is a bottleneck operation in many\nfactor analysis algorithms used in machine learning, including Alternating\nLeast Squares and Latent Dirichlet Allocation [1]. Mathematically, the\noperation can be expressed as \n\n\n\n\nA = B \\circ CD,\n\n\n\n\nwhere \nA\n and \nB\n are sparse matrices, \nC\n and \nD\n are dense matrices,\nand \n\\circ\n denotes component-wise multiplication. This operation can also be\nexpressed in \nindex\nnotation\n as \n\n\n\n\nA_{ij} = B_{ij} \\cdot C_{ik} \\cdot C_{kj}.\n\n\n\n\nYou can use the taco C++ library to easily and efficiently compute the SDDMM, as\nshown here:\n\n\n// On Linux and MacOS, you can compile and run this program like so:\n// g++ -std=c++11 -O3 -DNDEBUG -DTACO -I ../../include -L../../build/lib sddmm.cpp -o sddmm -ltaco\n// LD_LIBRARY_PATH=../../build/lib ./sddmm\n#include \nrandom\n\n#include \"taco.h\"\nusing namespace taco;\nint main(int argc, char* argv[]) {\n std::default_random_engine gen(0);\n std::uniform_real_distribution\ndouble\n unif(0.0, 1.0);\n // Predeclare the storage formats that the inputs and output will be stored as.\n // To define a format, you must specify whether each dimension is dense or sparse\n // and (optionally) the order in which dimensions should be stored. The formats\n // declared below correspond to doubly compressed sparse row (dcsr), row-major\n // dense (rm), and column-major dense (dm).\n Format dcsr({Sparse,Sparse});\n Format rm({Dense,Dense});\n Format cm({Dense,Dense}, {1,0});\n\n // Load a sparse matrix from file (stored in the Matrix Market format) and\n // store it as a doubly compressed sparse row matrix. Matrices correspond to\n // order-2 tensors in taco. The matrix in this example can be download from:\n // https://www.cise.ufl.edu/research/sparse/MM/Williams/webbase-1M.tar.gz\n Tensor\ndouble\n B = read(\"webbase-1M.mtx\", dcsr);\n // Generate a random dense matrix and store it in row-major (dense) format.\n Tensor\ndouble\n C({B.getDimension(0), 1000}, rm);\n for (int i = 0; i \n C.getDimension(0); ++i) {\n for (int j = 0; j \n C.getDimension(1); ++j) {\n C.insert({i,j}, unif(gen));\n }\n }\n C.pack();\n\n // Generate another random dense matrix and store it in column-major format.\n Tensor\ndouble\n D({1000, B.getDimension(1)}, cm);\n for (int i = 0; i \n D.getDimension(0); ++i) {\n for (int j = 0; j \n D.getDimension(1); ++j) {\n D.insert({i,j}, unif(gen));\n }\n }\n D.pack();\n\n // Declare the output matrix to be a sparse matrix with the same dimensions as\n // input matrix B, to be also stored as a doubly compressed sparse row matrix.\n Tensor\ndouble\n A(B.getDimensions(), dcsr);\n\n // Define the SDDMM computation using index notation.\n IndexVar i, j, k;\n A(i,j) = B(i,j) * C(i,k) * D(k,j);\n\n // At this point, we have defined how entries in the output matrix should be\n // computed from entries in the input matrices but have not actually performed\n // the computation yet. 
You can also use the TACO Python library to perform the same computation, as
demonstrated here:

```python
import pytaco as pt
from pytaco import dense, compressed
import numpy as np

# Define formats that the inputs and output will be stored as. To define a
# format, you must specify whether each dimension is dense or sparse and
# (optionally) the order in which dimensions should be stored. The formats
# declared below correspond to doubly compressed sparse row (dcsr), row-major
# dense (rm), and column-major dense (cm).
dcsr = pt.format([compressed, compressed])
rm = pt.format([dense, dense])
cm = pt.format([dense, dense], [1, 0])

# The matrix in this example can be downloaded from:
# https://www.cise.ufl.edu/research/sparse/MM/Williams/webbase-1M.tar.gz
B = pt.read("webbase-1M.mtx", dcsr)

# Generate two random dense matrices using NumPy and pass them into TACO.
C = pt.from_array(np.random.uniform(size=(B.shape[0], 1000)))
D = pt.from_array(np.random.uniform(size=(1000, B.shape[1])), out_format=cm)

# Declare the result to be a doubly compressed sparse row matrix.
A = pt.tensor(B.shape, dcsr)

# Declare index vars.
i, j, k = pt.get_index_vars(3)

# Define the SDDMM computation.
A[i, j] = B[i, j] * C[i, k] * D[k, j]

# Perform the SDDMM computation and write the result to file.
pt.write("A.mtx", A)
```

When you run the above Python program, TACO will generate code under the hood
that efficiently performs the computation in one shot. This lets TACO compute
only the elements of the intermediate dense matrix product $CD$ that are
actually needed for the result, reducing the asymptotic complexity of the
computation.
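To make the complexity claim concrete: suppose $B$ is $I \times J$ with
$\mathrm{nnz}(B)$ stored entries and the shared inner dimension is $K$
($K = 1000$ in the examples above). The fused kernel performs one length-$K$
dot product per stored entry of $B$, whereas materializing $CD$ first costs a
dot product per output entry:

$$\underbrace{O(\mathrm{nnz}(B) \cdot K)}_{\text{fused SDDMM}}
\qquad\text{vs.}\qquad
\underbrace{O(I \cdot K \cdot J)}_{\text{materialize } CD\text{, then sample}}$$

For a highly sparse matrix like webbase-1M, $\mathrm{nnz}(B) \ll I \cdot J$,
so the saving is substantial.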
[1] Huasha Zhao. 2014. High Performance Machine Learning through Codesign and
Rooflining. Ph.D. Dissertation. EECS Department, University of California,
Berkeley.