lamppp
Let's get you up and running! This should take about 10 minutes if everything goes smoothly, or maybe 30 minutes if you hit some bumps along the way.
**Required:**

- CMake (check your version with `cmake --version`)
- OpenMP (usually comes with your compiler)

**Optional but recommended:**

- Python 3.11+ (for running tests and examples)

**Quick compatibility check:**
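One way to run such a check, as a minimal sketch assuming a POSIX shell. `check_tool` is a hypothetical helper, not part of lamppp, and it only confirms that the tools named above are on your `PATH`:

```shell
# Hypothetical helper (not part of lamppp): report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Tools named in the requirements above; "|| true" keeps going after a miss.
check_tool cmake || true
check_tool python3 || true

# Python 3.11+ is only needed for the tests and examples.
if command -v python3 >/dev/null 2>&1; then
    python3 -c 'import sys; print("python ok" if sys.version_info >= (3, 11) else "python too old")'
fi
```

A `missing:` line for `python3` is not fatal, since Python is optional.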
That's it for a CPU-only build. Everything should compile in under a minute.
If you have an NVIDIA GPU and want to use it:
The build system will auto-detect your GPU architecture, so you don't need to worry about compute capabilities.
**Release build (fast, no debug info):**

**Debug build (slower, with debug symbols):**

**With code coverage (for contributors):**
The Release build includes `-march=native` and `-ffast-math`, so it's optimized for your specific CPU. Debug builds include helpful debug symbols and assertions.
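As a rough sketch of switching between Release and Debug, the helper below assumes the project honors CMake's standard `CMAKE_BUILD_TYPE` variable. The names `build_lamppp`, `run`, and `DRY_RUN`, and the `build-<type>` directory layout, are hypothetical and chosen only for illustration:

```shell
# Hypothetical helpers (not part of lamppp).
# run: execute a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_lamppp: configure and build into a per-type directory.
# Assumes the project uses the standard CMAKE_BUILD_TYPE variable.
build_lamppp() {
    build_type="${1:-Release}"
    build_dir="build-${build_type}"
    run cmake -S . -B "$build_dir" -DCMAKE_BUILD_TYPE="$build_type" &&
        run cmake --build "$build_dir"
}
```

Running `DRY_RUN=1 build_lamppp Debug` from the repository root prints the two `cmake` commands without executing them; drop `DRY_RUN=1` to actually build. Keeping one directory per build type avoids reconfiguring when you switch back and forth.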
Create a file called `test.cpp`:
To compile and run it:
Or if you built with CUDA:
The MNIST example is a good way to see everything working together:
**First, get the data:**

**Run the example:**
This trains a simple 2-layer neural network on MNIST. You should see training accuracy improving over time. The network gets to about 85-90% accuracy, which isn't state-of-the-art but shows that everything is working.
**Basic test suite:**

**Individual test suites:**

**With verbose output:**
The tests cover all the basic tensor operations, gradient computation, and CUDA kernels (if enabled). They should all pass on a fresh build.
These will give you an idea of performance on your system. The benchmarks compare different operation implementations and should help identify any performance issues. You can also check the corresponding PyTorch benchmarks to see how the two libraries compare.
**CUDA not found errors:**

- Run `nvcc --version` to verify your CUDA installation
- You can always build without CUDA using `-DENABLE_CUDA=OFF`
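The two checks above can be combined into a small sketch that picks the flag for you. `cuda_flag` is a hypothetical helper, and `-DENABLE_CUDA=ON` is assumed to be the counterpart of the `-DENABLE_CUDA=OFF` option mentioned above:

```shell
# Hypothetical helper (not part of lamppp): choose the CUDA option based on
# whether nvcc is on PATH. ON is assumed to mirror the documented OFF value.
cuda_flag() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "-DENABLE_CUDA=ON"
    else
        echo "-DENABLE_CUDA=OFF"
    fi
}
```

For example, `cmake .. "$(cuda_flag)"` from the build directory would fall back to a CPU-only configuration on machines without the CUDA toolkit.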
**OpenMP linking errors:**

- On Debian/Ubuntu: `sudo apt install libomp-dev`
- On macOS: `brew install libomp`
**Tests failing:**

- Try a clean rebuild: `rm -rf build && mkdir build && cd build && cmake .. && make`