I need to compute an approximate solution to a system of linear equations using non-negative least squares (NNLS). The input is a matrix (N rows, M columns) and a column vector (N elements), and the expected output is a vector (M elements).
Currently, I am using the `nnls` interface from the `scipy.optimize` module in Python, but the problem is too large (N=100K, M=7K), and the computation takes far too long (10 h, single-threaded). I would therefore like to use a more efficient C/C++ implementation, or a multi-threaded approach if possible. I also need the compiled binary to have as few dependencies as possible, so that it can run directly on a new machine with a compatible instruction set.
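For reference, here is a scaled-down sketch of the kind of call I mean. The sizes, the random data, and the variable names (`A`, `b`, `x_true`) are placeholders chosen for illustration, not my real inputs; the real problem has N=100K and M=7K.

```python
import numpy as np
from scipy.optimize import nnls

# Placeholder sizes for illustration only (real problem: N=100_000, M=7_000).
N, M = 200, 20
rng = np.random.default_rng(0)

A = rng.random((N, M))           # input matrix, N rows x M columns
x_true = rng.random(M) + 0.1     # synthetic, strictly positive "true" solution
b = A @ x_true                   # right-hand side vector, N elements

# Solve: minimize ||A x - b||_2 subject to x >= 0.
# nnls returns the solution vector and the 2-norm of the residual.
x, residual_norm = nnls(A, b)

print(x.shape)                   # vector with M elements
print(residual_norm)
```

Since `b` is built here from a non-negative `x_true`, the recovered `x` should match it closely; with real data the residual will generally be non-zero.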
Which libraries can meet this requirement?