Course in data science. Learn to analyze data of all types using the Python programming language. No programming experience is necessary.
Software covered:
Course topics include:
O'Reilly Media titles are free to UCSD affiliates with Safari Books Online.
Weekly take-home assignments will follow the course schedule, reinforcing skills with exercises to analyze and visualize scientific data. Assignments will be given out on Wednesdays and will be due the following Wednesday, using TritonEd. Assignments are worth 8 points each and will be graded on effort, completeness, and accuracy.
You will choose a dataset, either your own or one provided in one of the texts, and write a Python program (or a set of Python programs, or a mixture of .ipynb and .py/.sh scripts) to carry out a revealing data analysis or create a software tool. Have a look at Shaw Ex43-52 and McKinney Ch10-12 for more ideas. The final project is worth 20 points and will be graded on effort, creativity, and fulfillment of the requirements below.
Requirements:
Use pandas and one or more packages from at least three (≥3) of the categories below:
- matplotlib, seaborn
- bokeh, pygal, plotly, mpld3, nvd3
- scipy, statsmodels, scikit-learn
- scikit-bio, biopython
- cdms, iris
A total of 100 points is possible for the course.
Participation is based on completing the pre-course survey, showing up to class (when you are able), and completing the course evaluation (this is on the honor system as I won't know who completes it). There are no midterm or final exams.
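The point values stated above can be checked with a little arithmetic. The sketch below assumes nine assignments (counted from Assignments 1-9 in the lesson schedule) and infers the participation share as whatever remains. The participation figure is an inference, not a number stated in this syllabus.

```python
# Point values stated in the syllabus.
POINTS_PER_ASSIGNMENT = 8
FINAL_PROJECT_POINTS = 20
TOTAL_POINTS = 100

# Assumption: nine assignments, counted from the lesson schedule.
NUM_ASSIGNMENTS = 9

assignment_points = POINTS_PER_ASSIGNMENT * NUM_ASSIGNMENTS

# Inferred: the points left after assignments and the final project
# are taken to be the participation share.
participation_points = TOTAL_POINTS - assignment_points - FINAL_PROJECT_POINTS

print(f"Assignments: {assignment_points}")
print(f"Final project: {FINAL_PROJECT_POINTS}")
print(f"Participation (inferred): {participation_points}")
```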
The course consists of 20 lessons. As a class, it is taught as two lessons per week for 10 weeks, but the material can be covered at any pace.
Lessons 1-3 will be an introduction to the command line. By the end of this tutorial, everyone will be familiar with basic Unix commands.
Lessons 4-9 will be an introduction to programming using Python. The main text will be Shaw's Learn Python 3 the Hard Way. For those with experience in a programming language other than Python, Lutz's Learning Python will provide a more thorough introduction to programming Python. We will learn to use IPython and IPython Notebooks (also called Jupyter Notebooks), a much richer Python experience than the Unix command line or Python interpreter.
Lessons 10-18 will focus on Python packages for data analysis. We will work through McKinney's Python for Data Analysis, which is all about analyzing data, doing statistics, and making pretty plots. You may find that Python can emulate or exceed much of the functionality of R and MATLAB.
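To give a flavor of the kind of analysis these lessons build toward, here is a minimal sketch using pandas. The sample data and the column names `site` and `temperature` are invented for illustration and do not come from the course materials.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B"],
        "temperature": [20.5, 21.0, 18.2, 19.0],
    }
)

# Summary statistics (count, mean, std, quartiles) for one column.
summary = df["temperature"].describe()

# Label-based selection with loc: keep the rows where site is "B".
site_b = df.loc[df["site"] == "B"]

# Group rows by site and take the mean temperature of each group.
mean_by_site = df.groupby("site")["temperature"].mean()

print(summary)
print(site_b)
print(mean_by_site)
```

The same pattern of loading a table, selecting a subset, and aggregating by group recurs in the Pandas lessons on `loc`, `describe`, and `groupby`.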
Lessons 19-20 conclude the course with two skills useful in developing code: writing your own classes and modules, and sharing your code on GitHub.
Lessons are available as .md or .ipynb files by clicking on the lesson numbers below. Readings should be completed while typing out the code (this is integral to the Shaw readings) and doing any Study Drills (Shaw) and Chapter Quizzes (Lutz).
Lesson | Title | Readings | Topics | Assignment |
---|---|---|---|---|
1 | Overview | -- | Introductions and overview of course | Pre-course survey; Acquire texts |
2 | Command Line Part I | Shaw: Introduction, Ex0, Appendix A | Command line crash course; Text editors | Assignment 1: Basic Shell Commands |
3 | Command Line Part II | Yale: The 10 Most Important Linux Commands | Advanced commands in the bash shell | -- |
4 | Conda, IPython, and Jupyter Notebooks | Geohackweek: Introduction to Conda | Conda tutorial including Conda environments, Python packages, and PIP; Python and IPython in the command line; Jupyter notebook tutorial; Python crash course | Assignment 2: Bash, Conda, IPython, and Jupyter |
5 | Python Basics, Strings, Printing | Shaw: Ex1-10; Lutz: Ch1-7 | Python scripts, error messages, printing strings and variables, strings and string operations, numbers and mathematical expressions, getting help with commands and IPython | -- |
6 | Taking Input, Reading and Writing Files, Functions | Shaw: Ex11-26; Lutz: Ch9,14-17 | Taking input, reading files, writing files, functions | Assignment 3: Python Fundamentals I |
7 | Logic, Loops, Lists, Dictionaries, and Tuples | Shaw: Ex27-39; Lutz: Ch8-13 | Logic and loops, lists and list comprehension, tuples, dictionaries, other types | -- |
8 | Python and IPython Review | McKinney: Ch1, Ch2, Ch3 | Review of Python commands, IPython review | Assignment 4: Python Fundamentals II |
9 | Regular Expressions | Kuchling: Regular Expression HOWTO | Regular expression syntax; Command-line tools: `grep`, `sed`, `awk`, `perl -e`; Python examples: built-in and `re` module | -- |
10 | Numpy, Pandas and Matplotlib Crashcourse | Pratik: Introduction to Numpy and Pandas | Numpy, Pandas, and Matplotlib overview | Assignment 5: Regular Expressions |
11 | Pandas Part I | McKinney: Ch4, Ch5 | Introduction to NumPy and Pandas: `ndarray`, `Series`, `DataFrame`, `index`, `columns`, `dtypes`, `info`, `describe`, `read_csv`, `head`, `tail`, `loc`, `iloc`, `ix`, `to_datetime` | -- |
12 | Pandas Part II | McKinney: Ch6, Ch7, Ch8 | Data Analysis with Pandas: `concat`, `append`, `merge`, `join`, `set_option`, `stack`, `unstack`, `transpose`, dot-notation, `values`, `apply`, `lambda`, `sort_index`, `sort_values`, `to_csv`, `read_csv`, `isnull` | Assignment 6: Pandas Fundamentals |
13 | Plotting with Matplotlib | McKinney: Ch9; Johansson: Matplotlib 2D and 3D plotting in Python | Matplotlib tutorial from J.R. Johansson | -- |
14 | Plotting with Seaborn | Seaborn Tutorial | Seaborn tutorial from Michael Waskom | Assignment 7: Plotting |
15 | Pandas Time Series | McKinney: Ch11 | Time series data in Pandas | -- |
16 | Pandas Group Operations | McKinney: Ch10 | `groupby`, `melt`, `pivot`, `inplace=True`, `reindex` | Assignment 8: Time Series and Group Operations |
17 | Statistics Packages | Handbook of Biological Statistics | Statistics capabilities of Pandas, Numpy, Scipy, and Scikit-bio | -- |
18 | Interactive Visualization with Bokeh | Bokeh User Guide | Quickstart guide to making interactive HTML and notebook plots with Bokeh | Assignment 9: Statistics and Interactive Visualization |
19 | Modules and Classes | Shaw: Ex40-52 | Packaging your code so you and others can use it again | -- |
20 | Git and GitHub | GitHub Guides | Sharing your code in a public GitHub repository | Final Project |
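
Lesson 9 pairs command-line tools with Python's `re` module. As a rough illustration of how filtering, substitution, and group capture look in Python, here is a minimal sketch. The sample lines and the pattern are invented for illustration and are not taken from the lesson itself.

```python
import re

# Hypothetical input lines, invented for illustration only.
lines = [
    "2017-01-09 lesson=1 topic=overview",
    "2017-01-11 lesson=2 topic=command-line",
    "no date on this line",
]

# An ISO-style date at the start of a line, captured as three groups.
date_pattern = re.compile(r"^(\d{4})-(\d{2})-(\d{2})")

# Filtering, similar in spirit to grep: keep only the matching lines.
matching = [line for line in lines if date_pattern.search(line)]

# Substitution, similar in spirit to sed: rewrite "lesson=N" as "LN".
renamed = [re.sub(r"lesson=(\d+)", r"L\1", line) for line in matching]

# Group capture: pull year, month, and day out of the first line.
year, month, day = date_pattern.match(lines[0]).groups()

print(matching)
print(renamed)
print(year, month, day)
```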