Awesome Open Source

Read, Love, Pray Awesome

This is a list of papers 📑 that I particularly enjoy and would love to share with you. If you have papers that you want to share, just open an issue and I will update the list!

Table of Contents

     Machine Learning


     Distributed System

Machine Learning

  • A Few Useful Things to Know about Machine Learning

    This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. While these lessons are simple to understand, I learn something new every time I read it.

  • Deep learning

    This is a review paper by Yann LeCun et al. that gives a good general picture of deep learning. In my opinion, it is what a Wikipedia page should look like after the summary section.

  • Dropout

    Regularization is a very important component of machine learning. This paper in particular addresses the regularization challenge in deep learning. While deep learning in some ways mimics the human brain's structure with layers of processing, it does not offer the structural flexibility that the human brain does. This paper describes a simple way to build a robust deep learning model by randomly turning off some of the "neurons" during training. What is particularly beautiful about this paper is that it starts with a biological motivation and ends with a great mathematical solution for machine learning.

  • Go Game and Deep Learning

    This is a very high-profile paper. While the concepts in the paper are not new, the way the DeepMind team put them together is an engineering gem. Plus, it is an interesting challenge, as Go is not an easy game...

  • Distilling the Knowledge in a Neural Network

    Simple tricks can help improve performance significantly but they usually come from insightful observation.

  • Generative Adversarial Nets

    So Jerry is trying to generate images to trick Tom... This is a theoretically sound paper shedding light on a new path for deep learning. Let's generate and make inferences while rationalizing the network.

  • Steerable CNNs

    Rationalize and understand CNNs via group theory. It's math-heavy, but isn't ML all about math :)

  • Tree LSTM

    A paper about generating sentence embeddings via an LSTM with a tree structure. I like this paper because it gives a great intro to the related work and generalizes the classical sequential LSTM. Beautifully formulated...
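
To make the Dropout entry above a bit more concrete, here is a minimal sketch of the idea in plain Python. It uses the "inverted dropout" variant, which rescales the surviving units during training; the paper itself instead scales the weights at test time. The function name `dropout` and its parameters are my own choices for illustration, not taken from the paper's code.

```python
import random


def dropout(activations, p_drop=0.5, training=True, rng=None):
    """Inverted dropout over a list of floats (illustrative helper only)."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    # At test time (or with p_drop == 0) every unit stays on, unchanged.
    if not training or p_drop == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p_drop
    # Each unit is kept with probability `keep`. Survivors are scaled by
    # 1/keep so the expected activation matches the test-time value.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    hidden = [0.2, 1.5, 0.7, 0.9]
    print(dropout(hidden, p_drop=0.5, rng=random.Random(42)))
    print(dropout(hidden, training=False))
```

Because a different random subset of units is silenced on every training step, no single "neuron" can rely on a specific partner being present, which is where the regularization effect comes from.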


Distributed System

  • Paxos made simple

    This is not a paper but a very simple and clear explanation of Paxos, which is considered difficult to understand. I think Paxos is a super neat idea for replicating nodes, and imagining a bunch of machines voting on what they want to do together is hilarious. While it is not very efficient communication-wise, forcing quorums to overlap by requiring a majority is so simple in math and so powerful in systems.

  • Dynamo

    Dynamo is an interesting paper for me because it shows what engineering really is. Amazon has a unique need to make sure all put operations go through with a 99.9% SLA, and they designed a system specifically for it. This is also a very well-written paper with a good description of related work and of their motivation, which shows a great deal of engineering mindset. The ring hash + virtual node idea is also very cute and actually inspired some other interesting distributed systems at Uber for real-time dispatching, which, again, is a unique problem with a specifically engineered solution.

  • ZooKeeper

    Love open source work and good modularization! While there are other synchronized cores out there, I find reading this paper entertaining because it shows how we can use a simple primitive to implement other complex systems.

  • ZooNet/Composition of Synchronized Core

    This is a relatively new paper that continues the topic of ZooKeeper. While a synchronized core is simple to use, it is hard to compose several of them and have multiple applications, each with its own ZooKeeper instance, synchronize with each other. However, this paper uses incredibly simple client-layer code (150 lines) to fix the problem, and that really captures the beauty of solving systems problems. Moreover, they found a "bug" in ZooKeeper after the 3.x version iterations and improved the performance, again, with some very simple code/ideas.

  • MapReduce

    I am a big fan of functional programming, so reading this paper is like an orgasm with the "map" and "reduce" functions. It is neat that Jeff Dean and his wingman, Sanjay Ghemawat, decided to think of distributed computing problems as map and reduce, and ended up modularizing the idea with cute engineering refinements and discussion.

  • Spanner

    Another paper from Google Research! This is a super complicated system involving Paxos, clocks, etc. However, it really pushes the boundary of what we can do with the most advanced technology and knowledge in the systems world. It's a good engineering lesson to see how all the pieces are put together and make sense.

  • Farm

    Well, yet another paper that yells "We want it all!". This paper is super cool with its "one-sided" RDMA reads (i.e. reading another computer's memory without interrupting its CPU) and its in-memory database (which is possible thanks to a special battery that can move memory to disk during a sudden power failure). We want it all!
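
The "ring hash + virtual node" idea from the Dynamo entry above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name `HashRing`, the `vnodes` parameter, and the use of SHA-256 are hypothetical choices, and the sketch leaves out everything else Dynamo does (replication, preference lists, failure handling).

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring where each physical node owns many virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Spread `vnodes` points per node around the ring to even out load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # Walk clockwise: first virtual node at or after the key's position,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect_left(self._ring, (_position(key), ""))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.get_node("user:42"))
    ring.remove_node("node-c")
    print(ring.get_node("user:42"))
```

The nice property is that removing a node only moves the keys that node owned; every other key keeps its owner, which is what makes the scheme friendly to nodes joining and leaving.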
