Awesome Open Source
  • Improve and analyse performance of your neural network (e.g. Tensor Cores compatibility)
  • Record/analyse internal state of torch.nn.Module as data passes through it
  • Do the above based on external conditions (using a single Callable to specify them)
  • Day-to-day neural network related duties (model size, seeding, time measurements etc.)
  • Get information about your host operating system, torch.nn.Module device, CUDA capabilities etc.

💡 Examples

Check documentation here:

1. Getting performance tips

  • Get instant performance tips about your module. Every problem described in the comments below will be reported:
import torch


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()  # Required before registering submodules
        self.convolution = torch.nn.Sequential(
            torch.nn.Conv2d(1, 32, 3),
            torch.nn.ReLU(inplace=True),  # Inplace may harm kernel fusion
            torch.nn.Conv2d(32, 128, 3, groups=32),  # Depthwise is slower in PyTorch
            torch.nn.ReLU(inplace=True),  # Same as before
            torch.nn.Conv2d(128, 250, 3),  # Wrong output size for TensorCores
        )

        self.classifier = torch.nn.Sequential(
            torch.nn.Linear(250, 64),  # Wrong input size for TensorCores
            torch.nn.ReLU(),  # Fine, no info about this layer
            torch.nn.Linear(64, 10),  # Wrong output size for TensorCores
        )

    def forward(self, inputs):
        # Pool to 1x1, then flatten everything except the batch dimension
        pooled = torch.nn.AdaptiveAvgPool2d(1)(self.convolution(inputs))
        return self.classifier(pooled.flatten(start_dim=1))

# All you have to do
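The "wrong size for TensorCores" comments above follow a commonly cited rule of thumb: with FP16 data, layer dimensions that are multiples of 8 are the ones most likely to be dispatched to Tensor Cores. The exact requirement depends on the GPU architecture, the data type and the cuDNN/cuBLAS versions, so the sketch below is only an illustration of that rule. The function name `misaligned_sizes` and the default `multiple=8` are assumptions made for this example; this is not necessarily the check torchfunc itself performs.

```python
def misaligned_sizes(layer_sizes, multiple=8):
    """Return the names of all dimensions that are not divisible by `multiple`."""
    return [name for name, size in layer_sizes.items() if size % multiple != 0]


# Dimensions taken from the Model defined above
sizes = {
    "Conv2d(128, 250) out_channels": 250,
    "Linear(250, 64) in_features": 250,
    "Linear(250, 64) out_features": 64,
    "Linear(64, 10) out_features": 10,
}

flagged = misaligned_sizes(sizes)
for name in flagged:
    print(f"{name} is not a multiple of 8")
```

Only the 64-wide dimension passes, which matches the layer marked as fine in the comments.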

2. Seeding, weight freezing and others

  • Seed globally (including NumPy and CUDA), freeze weights, check inference time and model size:
import torch
import torchfunc

# MNIST-sized example, but these functions work with any module
model = torch.nn.Linear(784, 10)
frozen = torchfunc.module.freeze(model, bias=False)

with torchfunc.Timer() as timer:
    frozen(torch.randn(32, 784))
    print(timer.checkpoint())  # Time since the beginning
    frozen(torch.randn(128, 784))
    print(timer.checkpoint())  # Since last checkpoint

print(f"Overall time {timer}; Model size: {torchfunc.sizeof(frozen)}")
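To make the checkpoint behaviour concrete, here is a minimal standard-library sketch of a context-manager timer in which each `checkpoint()` call returns the time elapsed since the previous checkpoint (or since entering the block, for the first call). The class name `CheckpointTimer` and its attributes are hypothetical and written only for this illustration; torchfunc's own `Timer` may be implemented differently.

```python
import time


class CheckpointTimer:
    """Hypothetical stand-in for illustration, not torchfunc's implementation."""

    def __enter__(self):
        self.start = time.perf_counter()
        self.last = self.start  # Moment of the most recent checkpoint
        return self

    def __exit__(self, *exc_info):
        self.end = time.perf_counter()
        return False  # Never swallow exceptions raised inside the block

    def checkpoint(self):
        """Return seconds elapsed since the previous checkpoint (or the start)."""
        now = time.perf_counter()
        elapsed = now - self.last
        self.last = now
        return elapsed


with CheckpointTimer() as timer:
    sum(range(100_000))
    first = timer.checkpoint()   # Time since the beginning
    sum(range(100_000))
    second = timer.checkpoint()  # Time since the first checkpoint

total = timer.end - timer.start
print(f"first: {first:.6f}s, second: {second:.6f}s, total: {total:.6f}s")
```

`time.perf_counter` is used because it is monotonic and has the highest available resolution, which suits short measurements.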

3. Record torch.nn.Module internal state

  • Record and sum per-layer activation statistics as data passes through network:
import torch
import torchfunc

# Still MNIST-sized, but any module can be put in its place
model = torch.nn.Sequential(
    torch.nn.Linear(784, 100),
    torch.nn.Linear(100, 50),
    torch.nn.Linear(50, 10),
)
# Recorder which sums all inputs to layers
recorder = torchfunc.hooks.recorders.ForwardPre(reduction=lambda x, y: x + y)
# Record only for torch.nn.Linear
recorder.children(model, types=(torch.nn.Linear,))
# Train your network normally (or pass data through it)
# Activations of all neurons of first layer!
print(recorder[1])  # You can also post-process this data easily with apply
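For readers curious how such a recorder could work, the following is a rough sketch built only on PyTorch's `register_forward_pre_hook`. The class `SumInputs`, its `data` and `handles` attributes, and the tiny model are hypothetical names chosen for this example; it assumes every batch has the same shape and is not torchfunc's actual implementation.

```python
import torch


class SumInputs:
    """Hypothetical helper (not torchfunc's code): running sum of layer inputs."""

    def __init__(self):
        self.data = []     # One accumulated tensor per hooked layer
        self.handles = []  # Hook handles, kept so the hooks can be removed later

    def children(self, model, types=(torch.nn.Linear,)):
        """Attach a forward pre-hook to every direct child of the given types."""
        for child in model.children():
            if isinstance(child, types):
                self.data.append(None)
                hook = self._make_hook(len(self.data) - 1)
                self.handles.append(child.register_forward_pre_hook(hook))
        return self

    def _make_hook(self, index):
        def hook(module, inputs):
            current = inputs[0].detach()
            previous = self.data[index]
            # Plays the role of reduction=lambda x, y: x + y
            self.data[index] = current.clone() if previous is None else previous + current

        return hook


model = torch.nn.Sequential(torch.nn.Linear(4, 3), torch.nn.Linear(3, 2))
recorder = SumInputs().children(model)

batch = torch.ones(2, 4)
model(batch)
model(batch)  # The second pass is added to the first

print(recorder.data[0])  # Input of the first layer, summed over both passes
```

Calling `handle.remove()` on each entry of `recorder.handles` detaches the hooks once recording is no longer needed.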

For other examples (and how to use condition), see the documentation.

🔧 Installation

🐍 pip

Latest release:

pip install --user torchfunc

Nightly:

pip install --user torchfunc-nightly

🐋 Docker

CPU standalone and various versions of GPU-enabled images are available on Docker Hub.

For CPU quickstart, issue:

docker pull szymonmaszke/torchfunc:18.04

Nightly builds are also available; just prefix the tag with nightly_. If you are going for a GPU image, make sure you have nvidia/docker installed and its runtime set.

❓ Contributing

If you find an issue, or you think some functionality may be useful to others and fits this library, please open a new Issue or create a Pull Request.

To get an overview of things one can do to help this project, see Roadmap.
