|
| 1 | +# The Structural EMBedding library (SEMB) |
| 2 | + |
| 3 | +**Authors: GEMS Lab Team @ University of Michigan** |
| 4 | + |
| 5 | +The SEMB library offers fast onboarding for exploring structural embeddings of graph data with heterogeneous methods. It provides a unified API and a modular codebase that make it easy to integrate 3rd party methods and datasets. |
| 6 | + |
| 7 | +The library ships with a set of popular methods and datasets that are ready to use immediately. |
| 8 | + |
| 9 | +The library requires *Python 3.7+*. |
| 10 | + |
| 11 | +## Getting started |
| 12 | + |
| 13 | +Make sure you are using *Python 3.7+* for all of the steps below! |
| 14 | + |
| 15 | +### Installation |
| 16 | +`python setup.py install` (TODO: Pip support will be added soon) |
| 17 | + |
| 18 | +### Import and load a dataset |
| 19 | +```py |
| 20 | +from semb.data import load, get_dataset_ids |
| 21 | +# explore all datasets (both built in and extended by 3rd party) |
| 22 | +ids = get_dataset_ids() |
| 23 | +# load a dataset |
| 24 | +graph = load(ids[0]) |
| 25 | +``` |
| 26 | + |
| 27 | +### Import and load a method |
| 28 | +```py |
| 29 | +from semb.methods import load, get_method_ids |
| 30 | +# explore all methods (both built in and extended by 3rd party) |
| 31 | +ids = get_method_ids() |
| 32 | +# load a method, returns a constructor for a method's base class |
| 33 | +Method = load(ids[0]) |
| 34 | +# create and run a method. |
| 35 | +# NOTE: except for the first "graph" arg, every other argument MUST be passed in keyword form! |
| 36 | +method = Method(graph, a=1, b=2, c=3)  # a, b, c stand in for method-specific keyword arguments |
| 37 | +method.train() |
| 38 | +embeddings = method.get_embeddings() |
| 39 | +``` |
| 40 | + |
| 41 | +## Extending SEMB |
| 42 | + |
| 43 | +First make sure the `semb` library is installed. |
| 44 | + |
| 45 | +### Developing 3rd party Dataset extension |
| 46 | + |
| 47 | +- Create a Python 3.7+ [package](https://packaging.python.org/tutorials/packaging-projects/) with a name of the form `semb-dataset[$YOUR_CHOSEN_DATASET_ID]` |
| 48 | +- Within the package root directory, make sure `__init__.py` is present |
| 49 | +- Create a `dataset.py` and define a `Dataset` class that inherits from `BaseDataset` (`from semb.data import BaseDataset`) and implements the required methods. See `src/datasets/airports/dataset.py` for more details. |
| 50 | +- Install the package via `setup.py` or pip. |
| 51 | +- Now the dataset is loadable by the main client program that uses `semb`! |
| 52 | + |
| 53 | +### Developing 3rd party Method extension |
| 54 | + |
| 55 | +- Create a Python 3.7+ [package](https://packaging.python.org/tutorials/packaging-projects/) with a name of the form `semb-method[$YOUR_CHOSEN_METHOD_ID]` |
| 56 | +- Within the package root directory, make sure `__init__.py` is present |
| 57 | +- Create a `method.py` and define a `Method` class that inherits from `BaseMethod` (`from semb.methods import BaseMethod`) and implements the required methods. See `src/methods/node2vec/method.py` for more details. |
| 58 | +- Install the package via `setup.py` or pip. |
| 59 | +- Now the method is loadable by the main client program that uses `semb`! |
| 60 | + |
| 61 | +### Note |
| 62 | +For both `dataset` and `method` extensions, make sure `get_id()` is overridden and returns the same ID you chose for your package name. |
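
The ID rule in the note above can be sketched as follows. This is a minimal illustration and does not reproduce the library's actual interface: `BaseDataset` here is a local stand-in so the snippet runs without `semb` installed, `my_graph` and `id_from_package_name` are hypothetical names, `get_id()` is assumed to be a plain instance method, and the square brackets in the package name are assumed to be literal.

```python
import re


class BaseDataset:
    """Local stand-in for semb.data.BaseDataset (assumption: the real class may differ)."""

    def get_id(self) -> str:
        raise NotImplementedError


class Dataset(BaseDataset):
    """Hypothetical extension shipped in a package named semb-dataset[my_graph]."""

    def get_id(self) -> str:
        # Must return the same ID that appears in the package name.
        return "my_graph"


def id_from_package_name(package_name: str) -> str:
    """Extract the ID from a name of the form semb-dataset[ID] or semb-method[ID]."""
    match = re.fullmatch(r"semb-(dataset|method)\[(.+)\]", package_name)
    if match is None:
        raise ValueError(f"unexpected package name: {package_name!r}")
    return match.group(2)


package_name = "semb-dataset[my_graph]"  # hypothetical package name
ids_match = Dataset().get_id() == id_from_package_name(package_name)
print(ids_match)
```

A check like this could live in the extension package's own tests, so that renaming the package without updating `get_id()` is caught before installation.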