A Growing Network of Characters In Marvel and DC Universes

Abstract

Marvel and DC Comics are the two most important comic books publishers in the world. Over the years, each of them told stories of more than 20 000 characters, developing their personalities and making them interact with each others to build two gigantic universes.

The relationships between the characters represent a huge potential for graph visualization: the characters are nodes, and the links between them show how close they are to each other. Studying these links over time will allow an overall visualization of the evolution of the two separate universes thoughout the years, which will show the emergence of some groups. The study will be made for three different periods of time: [beginning - 1950], [beginning - 1990], [beginning - today] ("beginning" means all the data since Marvel and DC started publishing comic books).

Dataset

The data for Marvel and DC were taken respectively from the following websites:

In each database, information on characters and on published comics was collected.

For the characters, the gathered attributes were: Name, Current Alias, Relatives, Affiliation (team, group or organization the character is affiliated to), URL of the webpage (used as a unique ID).

For the comic books, the gathered attributes were: Publication date, Subcomics names (if a comic book is made of multiple comics) Appearing characters with a URL (to make sure they're in the database).

The graphs are based on three attributes: the characters that are relatives, the ones that are linked by the same affiliation, and the ones appearing in the same comic book.

Repository

Notebooks

All the used notebooks for the data gathering and analysis.

parsing_data.ipynb: scraping of the databases
cleaning_data.ipynb: cleaning the obtained data
relatives_analysis.ipynb: generating the adjacency matrix based on the relatives
affiliation_analysis.ipynb: generating the adjacency matrix based on the affiliations
comics_analysis.ipynb: generating the adjacency matrix based on the comics appearances
adjacency_processing.ipynb: plotting the degree distributions of the adjacency matrices and generating the corresponding CSV files for the graph visualisation (Gephi software)

Data

Folder containing all the saved data.

characters_marvel.txt: Marvel characters and their attributes
comics_marvel.txt: Marvel comics and their attributes
clean_marvel.txt: Cleaned Marvel dataframe (characters + comics)
good_dc.txt: DC good characters and their attributes
bad_dc.txt: DC bad characters and their attributes
characters_dc.txt: DC characters and their attributes (good + bad)
comics_dc.txt: DC comics and their attributes
clean_dc.txt: cleaned DC dataframe (characters + comics)
https://drive.google.com/drive/folders/1fZhjABSlYjy7Pkc2HMnRkctInPVhlnGA?usp=sharing: rest of the data that was too large for git
- Adjacency matrices: adjacency matrices, the file names are adj_attribute_date. Attribute: aff means affiliations, relat means relatives, comics means comic books. Date: 50 means until 1950, 90 means until 1990, no date means all the dataset.
- CSV: made from the adjacency matrices to match the Gephi inputs. The name format is the same as the adjacency matrices, replacing adj with edge.

Report

team 28 REPORT.pdf: final report

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.ipynb_checkpoints

.ipynb_checkpoints

data

data

.DS_Store

.DS_Store

README.md

README.md

Team 28 REPORT.pdf

Team 28 REPORT.pdf

adjacency_processing.ipynb

adjacency_processing.ipynb

affiliation_analysis.ipynb

affiliation_analysis.ipynb

cleaning_data.ipynb

cleaning_data.ipynb

comics_analysis.ipynb

comics_analysis.ipynb

parsing_data.ipynb

parsing_data.ipynb

relatives_analysis.ipynb

relatives_analysis.ipynb

Repository files navigation

A Growing Network of Characters In Marvel and DC Universes

Abstract

Dataset

Repository

Notebooks

Data

Report

About

Releases

Packages

Contributors 3

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
.ipynb_checkpoints		.ipynb_checkpoints
data		data
.DS_Store		.DS_Store
README.md		README.md
Team 28 REPORT.pdf		Team 28 REPORT.pdf
adjacency_processing.ipynb		adjacency_processing.ipynb
affiliation_analysis.ipynb		affiliation_analysis.ipynb
cleaning_data.ipynb		cleaning_data.ipynb
comics_analysis.ipynb		comics_analysis.ipynb
parsing_data.ipynb		parsing_data.ipynb
relatives_analysis.ipynb		relatives_analysis.ipynb

ahmedkooli/ntds_comics

Folders and files

Latest commit

History

Repository files navigation

A Growing Network of Characters In Marvel and DC Universes

Abstract

Dataset

Repository

Notebooks

Data

Report

About

Resources

Stars

Watchers

Forks

Languages