| Project | Stars | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|---|---|
| Cglm | 1,683 | | | 11 days ago | 8 | November 19, 2020 | 44 | MIT | C | 📽 Highly Optimized Graphics Math (glm) for C |
| Numbers.js | 1,641 | 81 | 18 | 5 years ago | 6 | July 12, 2018 | 9 | Apache-2.0 | JavaScript | Advanced Mathematics Library for Node.js and JavaScript |
| Pycm | 1,364 | 5 | 8 | a month ago | 39 | April 27, 2022 | 12 | MIT | Python | Multiclass confusion matrix library in Python |
| Phpmatrix | 1,273 | 654 | 14 | 6 months ago | 18 | July 01, 2021 | 3 | MIT | PHP | PHP Class for handling Matrices |
| Joml | 653 | 208 | 22 | 4 days ago | 46 | February 11, 2022 | 6 | MIT | Java | A Java math library for OpenGL rendering calculations |
| Spectra | 601 | | | 6 months ago | | | 39 | MPL-2.0 | C++ | A header-only C++ library for large scale eigenvalue problems |
| Ejml | 493 | 5 | 17 | a month ago | 11 | November 05, 2020 | 23 | Apache-2.0 | Java | A fast and easy to use linear algebra library written in Java for dense, sparse, real, and complex matrices. |
| Matrex | 374 | 2 | 1 | 3 years ago | 18 | October 08, 2019 | 4 | BSD-3-Clause | Elixir | A blazing fast matrix library for Elixir/Erlang with C implementation using CBLAS. |
| Md_parola | 348 | | | 6 months ago | | | | LGPL-2.1 | C++ | Library for modular scrolling LED matrix text displays |
| Matrix Toolkits Java | 334 | 199 | 32 | 6 years ago | 6 | July 05, 2015 | 7 | LGPL-3.0 | Java | :rocket: High Performance Linear Algebra OOP |
float is a single precision (aka `float`) matrix framework for R. Base R has no single precision type: its "numeric" vectors/matrices are double precision (or possibly integer, but you know what I mean). Floats have half the precision of double precision data, for a pretty obvious performance vs accuracy tradeoff.

A matrix of floats should use about half as much memory as a matrix of doubles, and your favorite matrix routines will generally compute about twice as fast on them as well. However, the results will not be as accurate, and are much more prone to roundoff error/mass cancellation issues. Statisticians have a habit of overhyping the dangers of roundoff error, in this author's opinion. If your data is well-conditioned, then using floats is "probably" fine for many applications.
⚠️ **WARNING** ⚠️ Type promotion always defaults to the higher precision. So if a float matrix operates with an integer matrix, the integer matrix will be cast to a float first. Likewise, if a float matrix operates with a double matrix, the float will be cast to a double first. Similarly, any float matrix that is explicitly converted to a "regular" matrix will be stored in double precision.
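A minimal sketch of these promotion rules (assuming the float package is installed):

```r
library(float)

s <- fl(matrix(1:4, nrow=2))   # a 2x2 float matrix

# float op integer: the integer is promoted to float
is.float(s + 1L)               # TRUE

# float op double: the float is promoted to double
is.float(s + 1.0)              # FALSE

# explicit conversion to a "regular" matrix stores doubles
storage.mode(as.matrix(s))     # "double"
```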
The package requires single precision BLAS/LAPACK routines, which are not included in the default `libRblas` and `libRlapack` shipped from CRAN. If your BLAS/LAPACK libraries do not have what is needed, then the reference routines will be built (note that a Fortran compiler is required in this case). However, these can take a very long time to compile, and they will have much worse performance than optimized libraries. The topic of which BLAS/LAPACK to use and how to use them has been written about many times. If this is the first you're hearing of it, I would recommend you use Microsoft R Open.
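If you're not sure which libraries your R installation is linked against, base R can tell you (on R >= 3.4, `sessionInfo()` prints the BLAS/LAPACK paths):

```r
# LAPACK version string reported by R
La_version()

# Includes "BLAS:" and "LAPACK:" lines showing the linked libraries
sessionInfo()
```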
To install the R package, run:

```r
install.packages("float")
```

The development version is maintained on GitHub:

```r
remotes::install_github("wrathematics/float")
```
If you are installing on Windows and wish to get the best performance, then you will need to install from source after editing some files. After installing high-performance BLAS and LAPACK libraries, delete the text `$(LAPACK_OBJS)` from the line in `src/Makevars.win` beginning with `OBJECTS =`. You will also need to add the appropriate link line. This ensures that, on building, the package links with your high-performance libraries instead of compiling the reference versions. This is especially important for 32-bit Windows, where the internal LAPACK and BLAS libraries are built without compiler optimization because of a compiler bug.

Also, if you are using Windows on big endian hardware (I'm not even sure if this is possible), then you will need to change the 0 in `src/windows/endianness.h` to a 1. Failure to do so will cause very bizarre things to happen with the NA handlers.
Before we get to the main usage of the package and its methods, a few notes on types and conversions:

- To cast to a float vector/matrix, use `as.float()` (or its shorthand, `fl()`).
- To cast back, use `as.double()` or `as.integer()` (or their shorthands, `dbl()` and `int()`).
- To create a float vector of a given length (analogous to `integer(5)`), use `float()`. The underlying constructor for float objects is `float32()`.
- R has a generic number type, "numeric", which encompasses integers and doubles. The function `is.numeric()` will return `FALSE` for float vectors/matrices. Similarly, `as.numeric()` will return the data cast as double.
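Putting the converters together (a sketch, assuming the float package is installed):

```r
library(float)

s <- fl(1:5)        # shorthand for as.float(1:5)
is.float(s)         # TRUE
is.numeric(s)       # FALSE -- floats are not "numeric" to base R

d <- dbl(s)         # cast back to double
typeof(d)           # "double"

i <- int(s)         # cast back to integer
typeof(i)           # "integer"

z <- float(5)       # length-5 float vector, analogous to integer(5)
length(z)           # 5
```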
The goal of the package is to recreate the matrix algebra facilities of the base package, but with floats. So we do not include higher statistical methods (like `lm()` and `prcomp()`).
Is something missing? Please let me know.
| Method | Status |
|---|---|
| `[` | done |
| `c()` | done |
| `cbind()` and `rbind()` | done |
| `diag()` | done |
| `is.na()` | done |
| `is.float()` | done |
| `min()` and `max()` | done |
| `na.omit()`, `na.exclude()` | done |
| `nrow()`, `ncol()`, `dim()` | done |
| `object.size()` | done |
| `print()` | done |
| `rep()` | done |
| `scale()` | Available for logical `center` and `scale` |
| `str()` | done |
| `sweep()` | Available for `FUN`'s `"+"`, `"-"`, `"*"`, and `"/"`. Others impossible(?) |
| `typeof()` and `storage.mode()` | No `storage.mode<-` method. |
| `which.min()` and `which.max()` | done |
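A few of the supported utilities in action (a sketch, assuming the float package is installed):

```r
library(float)

s <- fl(matrix(1:6, nrow=2))

dim(s)          # 2 3
which.max(s)    # 6
rep(fl(0), 3)   # a length-3 float vector of zeros

# sweep() is available for "+", "-", "*", and "/"
centered <- sweep(s, 2, colMeans(s), "-")
is.float(centered)   # TRUE
```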
| Method | Status |
|---|---|
| `+` | done |
| `*` | done |
| `-` | done |
| `/` | done |
| `^` | done |
| `>` | done |
| `>=` | done |
| `==` | done |
| `<` | done |
| `<=` | done |
| Method | Status |
|---|---|
| `dbl()` | done |
| `int()` | done |
| `fl()` | done |
| `as.vector()` and `as.matrix()` | done |
| Method | Status |
|---|---|
| `%*%` | done |
| `backsolve()` and `forwardsolve()` | done |
| `chol()`, `chol2inv()` | done |
| `crossprod()` and `tcrossprod()` | done |
| `eigen()` | only for symmetric inputs |
| `isSymmetric()` | done |
| `La.svd()` and `svd()` | done |
| `norm()` | done |
| `qr()`, `qr.Q()`, `qr.R()` | done |
| `rcond()` | done |
| `solve()` | done |
| `t()` | done |
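A short sketch of the linear algebra interface (assuming the float package is installed):

```r
library(float)

set.seed(1234)
x <- fl(matrix(rnorm(9), 3, 3))
b <- fl(rnorm(3))

xtx <- crossprod(x)     # t(x) %*% x, computed in single precision
sol <- solve(xtx, b)    # single precision linear solve
is.float(sol)           # TRUE

# compare against the double precision answer
err <- dbl(sol) - solve(dbl(xtx), dbl(b))
max(abs(err))           # small, but larger than double roundoff
```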
| Method | Status |
|---|---|
| `abs()`, `sqrt()` | done |
| `ceiling()`, `floor()`, `trunc()`, `round()` | done |
| `exp()`, `expm1()` | done |
| `gamma()`, `lgamma()` | done |
| `is.finite()`, `is.infinite()`, `is.nan()` | done |
| `log()`, `log10()`, `log2()` | done |
| `sin()`, `cos()`, `tan()`, `asin()`, `acos()`, `atan()` | done |
| `sinh()`, `cosh()`, `tanh()`, `asinh()`, `acosh()`, `atanh()` | done |
| Method | Status |
|---|---|
| `.Machine_float` | float analogue of `.Machine`; everything you'd actually want is there |
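For example, comparing machine epsilons (the `float.eps` field name is assumed to parallel `.Machine$double.eps`):

```r
library(float)

.Machine_float$float.eps   # single precision epsilon, about 1.19e-07
.Machine$double.eps        # double precision epsilon, about 2.22e-16
```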
| Method | Status |
|---|---|
| `colMeans()` | done |
| `colSums()` | done |
| `rowMeans()` | done |
| `rowSums()` | done |
| `sum()` | done |
Memory consumption is roughly half when using floats:

```r
library(float)

m = 10000
n = 2500
memuse::howbig(m, n)
## 190.735 MiB

x = matrix(rnorm(m*n), m, n)
object.size(x)
## 200000200 bytes

s = fl(x)
object.size(s)
## 100000784 bytes
```
And the runtime performance is (generally) roughly 2x better:

```r
library(rbenchmark)

cols <- c("test", "replications", "elapsed", "relative")
reps <- 5

benchmark(crossprod(x), crossprod(s), replications=reps, columns=cols)
##           test replications elapsed relative
## 2 crossprod(s)            5   3.185    1.000
## 1 crossprod(x)            5   7.163    2.249
```
However, the accuracy is better in the double precision version:

```r
cpx = crossprod(x)
cps = crossprod(s)

all.equal(cpx, dbl(cps))
## [1] "Mean relative difference: 3.478718e-07"
```

For this particular example, the difference is fairly small; but for some operations/data, the difference could be significantly larger due to roundoff error.
Because of the use of S4 for the nice syntax, there is some memory overhead which is noticeable for small vectors/matrices. This cost is amortized quickly for reasonably large vectors/matrices. But storing many very small float vectors/matrices can be surprisingly costly.
For example, consider the cost for a single float vector vs a double precision vector:

```r
object.size(fl(1))
## 632 bytes

object.size(double(1))
## 48 bytes
```

However, once we get to 147 elements, the storage is identical:

```r
object.size(fl(1:147))
## 1216 bytes

object.size(double(147))
## 1216 bytes
```

And for vectors/matrices with many elements, the size of the double precision data is roughly twice that of the float data:

```r
object.size(fl(1:10000))
## 40624 bytes

object.size(double(10000))
## 80040 bytes
```
The above analysis assumes that your `float` and `double` values conform to the IEEE 754 standard (which is required to build this package). It specifies that a `float` requires 4 bytes, and a `double` requires 8. The size of an `int` is actually system dependent, but is probably 4 bytes. This means that, for most objects, a float matrix should be somewhat larger than a similarly sized integer matrix, because the overhead for our float matrix is simply larger. However, for objects with many elements, the sizes will be roughly equal:

```r
object.size(fl(1:10000))
## 40624 bytes

object.size(1:10000)
## 40040 bytes
```
It's (generally) twice as fast and uses half the RAM compared to double precision. For some data analysis tasks, that's more important than having (roughly) twice as many decimal digits.

Why does `floatmat + 1` produce a numeric (double) matrix, but `floatmat + 1L` produce a float matrix? Type promotion always defaults to the highest type available. If you want the arithmetic to be carried out in single precision, cast the `1` with `fl(1)` first.
Yes.
If you can formulate the method in terms of existing functionality from the float package, then you're good. If not, you will likely have to write your own C/C++ code. See the For Developers section of the package vignette.