avrc :3
parent 0f65a7e86c
commit a54a129866
5 changed files with 326744 additions and 0 deletions
24
anno3/avrc/assignments/coff/git_web_ml/README.txt
Normal file
@@ -0,0 +1,24 @@
GitHub Social Network

Description

A large social network of GitHub developers which was collected from the public API in June 2019. Nodes are developers who have starred at least 10 repositories and edges are mutual follower relationships between them. The vertex features are extracted based on the location, repositories starred, employer and e-mail address. The task related to the graph is binary node classification - one has to predict whether the GitHub user is a web or a machine learning developer. This target feature was derived from the job title of each user.

Properties

- Directed: No.
- Node features: Yes.
- Edge features: No.
- Node labels: Yes. Binary-labeled.
- Temporal: No.
- Nodes: 37,700
- Edges: 289,003
- Density: 0.001
- Transitivity: 0.013
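
The density and transitivity properties listed above can be recomputed from the graph itself. The following is a minimal sketch, not the dataset authors' code: `density`, `transitivity` and the `toy` graph are hypothetical names introduced only for illustration. It assumes an undirected simple graph, where density is 2E / (N(N-1)) and transitivity is the share of connected triples that close into a triangle.

```python
from itertools import combinations
from typing import Dict, Set


def density(num_nodes: int, num_edges: int) -> float:
    """Density of an undirected simple graph: 2E / (N * (N - 1))."""
    if num_nodes < 2:
        return 0.0
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


def transitivity(adj: Dict[int, Set[int]]) -> float:
    """Global transitivity: closed triples divided by all connected triples."""
    closed = 0   # triples whose two end nodes are also linked
    triples = 0  # paths of length two, counted at their centre node
    for neighbours in adj.values():
        for u, v in combinations(neighbours, 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    return closed / triples if triples else 0.0


# Toy graph (illustrative only): a triangle 0-1-2 with node 3 attached to node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

toy_transitivity = transitivity(toy)          # 3 closed triples out of 5
readme_density = density(37_700, 289_003)     # node and edge counts from the list above
```

With the node and edge counts listed above, this formula gives roughly 0.0004; the listed value of 0.001 may reflect rounding or a different convention. Transitivity cannot be derived from the counts alone and needs the full edge list.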

Possible Tasks

- Binary node classification
- Link prediction
- Community detection
- Network visualization
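
As an illustration of the binary node classification task, the sketch below predicts a node's label by majority vote over its labelled neighbours. This is a simple baseline chosen for the example, not a method from the README or the cited paper. The function name `neighbour_vote`, the toy graph, and the use of 0 and 1 as class labels are all assumptions.

```python
from collections import Counter
from typing import Dict, Set


def neighbour_vote(
    adj: Dict[int, Set[int]],
    known: Dict[int, int],
    node: int,
    default: int = 0,
) -> int:
    """Predict a binary label for `node` from its neighbours' known labels."""
    votes = Counter(known[n] for n in adj.get(node, ()) if n in known)
    # No labelled neighbours, or a tie: fall back to the default class.
    if votes[0] == votes[1]:
        return default
    return 1 if votes[1] > votes[0] else 0


# Toy data (illustrative only): node 0 has three labelled neighbours.
toy_adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
toy_known = {1: 1, 2: 1, 3: 0}

pred_node_0 = neighbour_vote(toy_adj, toy_known, 0)  # two votes for 1, one for 0
pred_node_4 = neighbour_vote(toy_adj, toy_known, 4)  # single neighbour labelled 0
pred_node_1 = neighbour_vote(toy_adj, toy_known, 1)  # neighbour 0 is unlabelled
```

A vote like this uses only the edges; the node features mentioned in the description are ignored, so it is best read as a lower bar for feature-based models.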
14
anno3/avrc/assignments/coff/git_web_ml/citing.txt
Normal file
@@ -0,0 +1,14 @@
If you find this dataset useful in your research, please consider citing the following paper:

>@misc{rozemberczki2019multiscale,
    title = {Multi-scale Attributed Node Embedding},
    author = {Benedek Rozemberczki and Carl Allen and Rik Sarkar},
    year = {2019},
    eprint = {1909.13021},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

And take a look at the project itself:

https://github.com/benedekrozemberczki/MUSAE
289004
anno3/avrc/assignments/coff/git_web_ml/musae_git_edges.csv
Normal file
File diff suppressed because it is too large
File diff suppressed because one or more lines are too long
37701
anno3/avrc/assignments/coff/git_web_ml/musae_git_target.csv
Normal file
File diff suppressed because it is too large
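
Since the CSV diffs are not shown, here is a hedged sketch of how the two files might be read. The line counts above (289004 and 37701) are each one more than the edge and node counts in the README, which suggests one header row per file. The column names `id_1`, `id_2`, `id`, `name` and `ml_target`, the helper names, and the in-memory sample rows are assumptions; check them against the actual headers.

```python
import csv
import io
from typing import Dict, Iterable, Set


def read_edges(lines: Iterable[str]) -> Dict[int, Set[int]]:
    """Build an undirected adjacency map from a two-column CSV with a header."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    adj: Dict[int, Set[int]] = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        a, b = int(row[0]), int(row[1])
        # Edges are mutual follower relationships, so store both directions.
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def read_targets(lines: Iterable[str], id_col: str, label_col: str) -> Dict[int, int]:
    """Map node id to binary label, using the named header columns."""
    reader = csv.DictReader(lines)
    return {int(row[id_col]): int(row[label_col]) for row in reader}


# In-memory stand-ins for the real files (made-up rows, assumed column names).
sample_edges = "id_1,id_2\n0,1\n1,2\n"
sample_targets = "id,name,ml_target\n0,alice,0\n1,bob,1\n2,carol,0\n"

adj = read_edges(io.StringIO(sample_edges))
labels = read_targets(io.StringIO(sample_targets), "id", "ml_target")
```

For the real data, the same functions can be given an open file handle, for example `open("musae_git_edges.csv", newline="")`, in place of the `io.StringIO` objects.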