Dye Score¶
Utilities to build the dye-score metric from OpenWPM javascript call data.
- GitHub repo: https://github.com/mozilla/dye-score/
- Documentation: https://dyescore.readthedocs.io/
- Free software: Mozilla Public License
Quickstart¶
Install dye score: conda install dye-score -c conda-forge
or pip install dye-score
Review usage notebook: https://github.com/mozilla/dye-score/blob/master/docs/usage.ipynb
Documentation Contents¶
Installation¶
Prerequisites¶
You will need Apache Spark including PySpark available on your system.
From sources¶
The sources for Dye Score can be downloaded from the GitHub repo.
You can either clone the public repository:
$ git clone git://github.com/mozilla/dye_score
Or download the tarball:
$ curl -OL https://github.com/mozilla/dye_score/tarball/master
Once you have a copy of the source, you can install it with:
$ python setup.py install
Usage¶
This notebook runs through using the dye score library and methodology to score scripts.
The input data is generated by OpenWPM. A dataset that has been used with the dye score is available at github.com/mozilla/overscripted
This notebook was run on a small sample.
Dye Score expects a Spark context to be available for the initial data processing steps.
Additionally, set up a Dask Client however you choose to. The cell below was generated by Dask’s JupyterLab extension.
Note that the warning is a known issue to the Dask team (https://github.com/dask/distributed/issues/2564).
[1]:
from dask.distributed import Client
client = Client("tcp://127.0.0.1:32829")
client
/home/bird/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/dask/config.py:168: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
data = yaml.load(f.read()) or {}
/home/bird/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/distributed/config.py:20: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
defaults = yaml.load(f)
[1]:
(Dask Client and Cluster summary widget)
[2]:
import dask.dataframe as dd
import numpy as np
from dye_score import DyeScore
[3]:
ds = DyeScore('config.yaml', print_config=False)
[4]:
ds.validate_input_data()
[4]:
True
[5]:
df = ds.get_input_df()
df.head()
[5]:
| top_level_url | script_url | func_name | symbol
---|---|---|---|---
0 | https://7ero.org/ | https://forsiteid6346.tech/convert/scripts/cre... | b.exec | CanvasRenderingContext2D.fillStyle |
1 | https://www.stevinsonhyundai.com/ | https://tag.contactatonce.com/le_secure_storag... | r | window.Storage.setItem |
2 | https://www.thecircle.com/us/ | https://www.thecircle.com/k3/ruxitagentjs_ICA2... | fc | window.document.cookie |
3 | https://www.jcpportraits.com/ | https://cdn.optimizely.com/js/8447592883.js | be/< | window.Storage.length |
4 | https://www.technik-profis.de/ | https://cdn.optimizely.com/js/8323142798.js | t.getUserAgent | window.navigator.userAgent |
[6]:
print(f'This sample is {len(df):,} rows')
This sample is 2,312,697 rows
Data Preparation¶
[7]:
%time ds.build_raw_snippet_df()
top_level_url \
0 https://7ero.org/
1 https://www.stevinsonhyundai.com/
2 https://www.thecircle.com/us/
3 https://www.jcpportraits.com/
4 https://www.technik-profis.de/
script_url func_name \
0 https://forsiteid6346.tech/convert/scripts/cre... b.exec
1 https://tag.contactatonce.com/le_secure_storag... r
2 https://www.thecircle.com/k3/ruxitagentjs_ICA2... fc
3 https://cdn.optimizely.com/js/8447592883.js be/<
4 https://cdn.optimizely.com/js/8323142798.js t.getUserAgent
symbol \
0 CanvasRenderingContext2D.fillStyle
1 window.Storage.setItem
2 window.document.cookie
3 window.Storage.length
4 window.navigator.userAgent
raw_snippet called
0 forsiteid6346.tech||createjs-2015.11.26.min.js... 1
1 tag.contactatonce.com||storage.secure.min.html||r 1
2 www.thecircle.com||ruxitagentjs_ICA27SVfhjoqrx... 1
3 cdn.optimizely.com||8447592883.js||be/< 1
4 cdn.optimizely.com||8323142798.js||t.getUserAgent 1
CPU times: user 121 ms, sys: 43 ms, total: 164 ms
Wall time: 39.7 s
[7]:
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_data/raw_snippet_call_df.parquet'
[8]:
%time ds.build_snippet_map()
raw_snippet snippet
0 forsiteid6346.tech||createjs-2015.11.26.min.js... 792826184637634903
1 tag.contactatonce.com||storage.secure.min.html||r -3182365903651065472
2 www.thecircle.com||ruxitagentjs_ICA27SVfhjoqrx... -9027005229756292155
3 cdn.optimizely.com||8447592883.js||be/< 2248811367515630966
4 cdn.optimizely.com||8323142798.js||t.getUserAgent -6265856453346281252
CPU times: user 81 ms, sys: 10.6 ms, total: 91.6 ms
Wall time: 1.71 s
[8]:
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_data/snippet_lookup.parquet'
The next two methods require your Spark session to be available, as it is passed to them explicitly.
[9]:
%time ds.build_snippets(spark)
/home/bird/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/pyarrow/__init__.py:159: UserWarning: pyarrow.open_stream is deprecated, please use pyarrow.ipc.open_stream
warnings.warn("pyarrow.open_stream is deprecated, please use "
Dataset has 216 unique symbols
<xarray.DataArray (snippet: 231057, symbol: 216)>
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])
Coordinates:
* snippet (snippet) object '-1000043381057326421' ... '999736522860943363'
* symbol (symbol) object 'AnalyserNode.channelCount' ... 'window.sessionStorage'
CPU times: user 13.6 s, sys: 859 ms, total: 14.5 s
Wall time: 4min
[9]:
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_data/snippets.zarr'
[12]:
%time ds.build_snippet_snippet_dyeing_map(spark)
CPU times: user 39.1 ms, sys: 51.7 ms, total: 90.8 ms
Wall time: 9.8 s
[12]:
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_data/snippet_dyeing_map.parquet'
No further functions depend on Spark.
Dyeing¶
Building the list of dye snippets is up to the user. Here we show an example using a keyword search for fingerprint.
[4]:
snippet_dyeing_map_file = ds.dye_score_data_file('snippet_dyeing_map')
snippet_data = dd.read_parquet(snippet_dyeing_map_file, engine='pyarrow')
snippet_data.head()
[4]:
| top_level_url | script_url | func_name | snippet | clean_script
---|---|---|---|---|---
0 | http://narvalife.ucoz.net/ | https://usocial.pro/usocial/fingerprint2.min.js | e.prototype.getNavigatorPlatform | 4996125033026346492 | usocial.pro/usocial/fingerprint2.min.js |
1 | https://sletaem.by/ | https://sletaem.by/ | updateTimer | -6846198680163094774 | sletaem.by/ |
2 | http://realcoco.com/ | http://fs.bizspring.net/fsn/bstrk.1.js | _trkdp_getCookie | 2578583411096044764 | fs.bizspring.net/fsn/bstrk.1.js |
3 | https://www.trendydiscount.shop/ | https://www.google-analytics.com/analytics.js | zc | 1695113790766404014 | www.google-analytics.com/analytics.js |
4 | https://www.liveaquaria.com/ | https://www.youtube.com/yts/jsbin/player-vflYg... | hE | 2066756695033721030 | www.youtube.com/yts/jsbin/player-vflYgf3QU/en_... |
[4]:
key = 'fingerprint'
filename_suffix = f'{key}_keyword'
thresholds = [0.15, 0.2, 0.23, 0.24, 0.25, 0.26, 0.3, 0.35]
[6]:
script_snippets = snippet_data[snippet_data.clean_script.str.contains(key, case=False)].snippet.unique().astype(str)
funcname_snippets = snippet_data[snippet_data.func_name.str.contains(key, case=False)].snippet.unique().astype(str)
dye_snippets = np.unique(np.append(script_snippets, funcname_snippets))
With the dye snippets in hand we can now use the DyeScore library to compute the dye scores for a range of thresholds.
[8]:
%time ds.compute_distances_for_dye_snippets(dye_snippets=dye_snippets, filename_suffix=filename_suffix)
/home/bird/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/dask/array/blockwise.py:204: UserWarning: The da.atop function has moved to da.blockwise
warnings.warn("The da.atop function has moved to da.blockwise")
<xarray.DataArray 'data' (snippet: 231057, dye_snippet: 553)>
dask.array<shape=(231057, 553), dtype=float64, chunksize=(10000, 100)>
Coordinates:
* snippet (snippet) object '-1000043381057326421' ... '999736522860943363'
* dye_snippet (dye_snippet) object '-1006661115172174629' ... '917589267078160730'
CPU times: user 453 ms, sys: 120 ms, total: 573 ms
Wall time: 1min 6s
[8]:
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_dye_distances_from_fingerprint_keyword'
[9]:
%time ds.compute_snippets_scores_for_thresholds(thresholds, filename_suffix=filename_suffix)
CPU times: user 1.55 s, sys: 253 ms, total: 1.8 s
Wall time: 13.1 s
[9]:
['/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_score_from_fingerprint_keyword_0.15',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_score_from_fingerprint_keyword_0.2',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_score_from_fingerprint_keyword_0.23',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_score_from_fingerprint_keyword_0.24',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_score_from_fingerprint_keyword_0.25',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_score_from_fingerprint_keyword_0.26',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_score_from_fingerprint_keyword_0.3',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/snippets_score_from_fingerprint_keyword_0.35']
[5]:
%time ds.compute_dye_scores_for_thresholds(thresholds, filename_suffix=filename_suffix)
CPU times: user 10.3 s, sys: 415 ms, total: 10.7 s
Wall time: 57 s
[5]:
['/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_from_fingerprint_keyword_0.15.csv.gz',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_from_fingerprint_keyword_0.2.csv.gz',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_from_fingerprint_keyword_0.23.csv.gz',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_from_fingerprint_keyword_0.24.csv.gz',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_from_fingerprint_keyword_0.25.csv.gz',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_from_fingerprint_keyword_0.26.csv.gz',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_from_fingerprint_keyword_0.3.csv.gz',
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_from_fingerprint_keyword_0.35.csv.gz']
Evaluate scores¶
We now manually review the dye scores against the input dye list in order to select the best distance threshold.
The review process needs a list of clean_script scripts to compare against the dye score list to produce the following plot. How this list is produced depends on how the dye snippets list was prepared.
[10]:
import pandas as pd
from bokeh.io import export_png, show
from bokeh.layouts import gridplot
from dye_score.plotting import get_pr_plot
from IPython.display import Image
[8]:
snippet_data.head()
[8]:
| top_level_url | script_url | func_name | snippet | clean_script
---|---|---|---|---|---
0 | http://narvalife.ucoz.net/ | https://usocial.pro/usocial/fingerprint2.min.js | e.prototype.getNavigatorPlatform | 4996125033026346492 | usocial.pro/usocial/fingerprint2.min.js |
1 | https://sletaem.by/ | https://sletaem.by/ | updateTimer | -6846198680163094774 | sletaem.by/ |
2 | http://realcoco.com/ | http://fs.bizspring.net/fsn/bstrk.1.js | _trkdp_getCookie | 2578583411096044764 | fs.bizspring.net/fsn/bstrk.1.js |
3 | https://www.trendydiscount.shop/ | https://www.google-analytics.com/analytics.js | zc | 1695113790766404014 | www.google-analytics.com/analytics.js |
4 | https://www.liveaquaria.com/ | https://www.youtube.com/yts/jsbin/player-vflYg... | hE | 2066756695033721030 | www.youtube.com/yts/jsbin/player-vflYgf3QU/en_... |
[9]:
compare_list = snippet_data[snippet_data.snippet.isin(dye_snippets)].clean_script.unique().compute()
compare_list.head()
[9]:
0 usocial.pro/usocial/fingerprint2.min.js
1 script.hotjar.com/modules-ab5ba0ccf53ded68dfc9...
2 www.convertthepdf.co/js/landing.js
3 track.adabra.com/sbn_fingerprint.v1.16.47.min.js
4 www.bestwestern.com.br/modules/mod_rewards_but...
Name: clean_script, dtype: object
[10]:
%time plot_df_paths = ds.build_plot_data_for_thresholds(compare_list, thresholds, filename_suffix=filename_suffix)
CPU times: user 39.7 s, sys: 148 ms, total: 39.9 s
Wall time: 39.9 s
[7]:
plot_df_paths[0]
[7]:
'/home/bird/Dev/mozilla/overscripted-clustering/new_data_dye_package/test_dyescore_results/dye_score_plot_data_from_fingerprint_keyword_0.15.csv.gz'
[11]:
plots = []
plot_opts = dict(tools='', toolbar_location=None, width=300, height=200)
for threshold, pr_df_path in zip(thresholds, plot_df_paths):
pr_df = pd.read_csv(pr_df_path)
plots.append(get_pr_plot(pr_df, title=f'{threshold}', plot_opts=plot_opts))
Image(export_png(gridplot(plots, ncols=3, toolbar_location=None)))
[11]:

The remaining analysis is up to the user, based on their preferred distance threshold.
API¶
DyeScore¶
class dye_score.dye_score.DyeScore(config_file_path, validate_config=True, print_config=False, sc=None)[source]¶
Parameters:
- config_file_path (str) – The path of the config file that dye score uses to interact with your environment. Holds references to file paths and private data such as AWS API keys. Expects a YAML file with the following keys:
- INPUT_PARQUET_LOCATION - the location of the raw or sampled OpenWPM input parquet folder
- DYESCORE_DATA_DIR - location where you would like dye score to store data assets
- DYESCORE_RESULTS_DIR - location where you would like dye score to store results assets
- USE_AWS - default False - set true if data store is AWS
- AWS_ACCESS_KEY_ID - optional - for storing and retrieving data on AWS
- AWS_SECRET_ACCESS_KEY - optional - for storing and retrieving data on AWS
- SPARK_S3_PROTOCOL - default ‘s3’ - only s3 or s3a are used
- PARQUET_ENGINE - default ‘pyarrow’ - pyarrow or fastparquet
Locations can be a local file path or a bucket.
- validate_config (bool, optional) – Run the DyeScore.validate_config method. Defaults to True.
- print_config (bool, optional) – Print out config once saved. Defaults to False.
- sc (SparkContext, optional) – If accessing s3 via s3a, pass a spark context to set AWS credentials.
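An illustrative config.yaml for a local run, using the keys described above (all paths are examples only, not defaults shipped with the library):

```yaml
# Illustrative config.yaml for a local run - paths are examples only
INPUT_PARQUET_LOCATION: /data/openwpm/sample.parquet
DYESCORE_DATA_DIR: /data/dyescore/data
DYESCORE_RESULTS_DIR: /data/dyescore/results
USE_AWS: false
PARQUET_ENGINE: pyarrow
```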
build_plot_data_for_thresholds(compare_list, thresholds, leaky_threshold, filename_suffix='dye_snippets', override=False)[source]¶
Builds a dataframe for evaluation. Contains the recall compared to the compare_list for scripts under the threshold.
Parameters:
- compare_list (list) – List of dye scripts to compare for recall.
- thresholds (list) – List of distances to compute snippet scores for, e.g. [0.23, 0.24, 0.25]
- leaky_threshold (float) – Remove all snippets which dye more than this fraction of all other snippets.
- filename_suffix (str, optional) – Change to differentiate between dye_snippet sets. Defaults to dye_snippets.
- override (bool, optional) – Override output files. Defaults to False.
Returns: list. Paths results were written to.
build_raw_snippet_df(override=False, snippet_func=None)[source]¶
Builds raw_snippets from input data.
The default snippet function is script_url.netloc||script_url.path_end||func_name. If script_url is missing, location is used.
Parameters:
- override (bool) – True to replace any existing outputs.
- snippet_func (function) – Function that accepts a row of data as input and computes the snippet value. Default provided.
Returns: str. The file path where output is saved.
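The default snippet construction can be sketched roughly as follows. This helper is illustrative, not the library’s exact implementation:

```python
# Illustrative sketch of the default snippet construction:
# script_url.netloc || script_url.path_end || func_name.
# Not the library's exact implementation.
from urllib.parse import urlparse

def raw_snippet(script_url, func_name, location=""):
    url = script_url or location  # fall back to location when script_url is missing
    parsed = urlparse(url)
    # path_end: the final component of the URL path
    path_end = parsed.path.rstrip("/").split("/")[-1]
    return f"{parsed.netloc}||{path_end}||{func_name}"

raw_snippet("https://cdn.optimizely.com/js/8447592883.js", "be/<")
# matches the raw_snippet column shown in the notebook above
```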
build_snippet_map(override=False)[source]¶
Builds snippet ids and saves a map of ids to raw snippets.
xarray cannot handle arbitrary-length string indexes, so we need to build a set of unique ids to reference snippets. This method creates the ids and saves the map between raw snippets and ids.
Parameters: override (bool, optional) – True to replace any existing outputs. Defaults to False.
Returns: str. The file path where output is saved.
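The id-mapping idea can be sketched as below: each raw snippet string is replaced by a stable integer id suitable for use as an xarray index. The hashing scheme here is illustrative; the library’s actual implementation may differ.

```python
# Illustrative sketch: map raw snippet strings to stable integer ids.
# The hashing scheme is an assumption, not the library's implementation.
import hashlib

def snippet_id(raw_snippet):
    # Stable signed 64-bit id derived from the snippet string
    digest = hashlib.md5(raw_snippet.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

mapping = {
    s: snippet_id(s)
    for s in [
        "cdn.optimizely.com||8447592883.js||be/<",
        "tag.contactatonce.com||storage.secure.min.html||r",
    ]
}
```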
build_snippet_snippet_dyeing_map(spark, override=False)[source]¶
Builds the file used to join snippets to data for dyeing.
Adds a clean_script field to the dataset. Saves a parquet file with:
- snippet – the int version, not raw_snippet
- document_url
- script_url
- clean_script
Parameters:
- spark (pyspark.sql.session.SparkSession) – spark instance
- override (bool, optional) – True to replace any existing outputs. Defaults to False.
Returns: str. The file path where output is saved.
build_snippets(spark, na_value=0, override=False)[source]¶
Builds the row-normalized snippet dataset.
- Dimensions are n snippets x s unique symbols in the dataset.
- Data is output in zarr format, with processing by spark, dask, and xarray.
- Creates an intermediate tmp file when converting from spark to dask.
- Slow-running operation – follow spark and dask status to see progress.
We use spark here because dask cannot memory-efficiently compute a pivot table. This is the only function we need a spark context for.
Parameters:
- spark (pyspark.sql.session.SparkSession) – spark instance
- na_value (int, optional) – The value to fill the vector where there’s no call. Defaults to 0.
- override (bool, optional) – True to replace any existing outputs. Defaults to False.
Returns: str. The file path where output is saved.
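The snippet-by-symbol pivot can be illustrated at toy scale with pandas (the library does this at scale with Spark; this sketch only shows the shape of the result):

```python
# Illustrative pandas sketch of the snippet-by-symbol pivot that
# build_snippets produces at scale with Spark (row-normalized counts).
import pandas as pd

calls = pd.DataFrame({
    "snippet": ["a", "a", "b"],
    "symbol": [
        "window.document.cookie",
        "window.Storage.setItem",
        "window.document.cookie",
    ],
})
# One row per snippet, one column per symbol, cell = call count
pivot = pd.crosstab(calls["snippet"], calls["symbol"])
normalized = pivot.div(pivot.sum(axis=1), axis=0)  # row-normalize
```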
compute_distances_for_dye_snippets(dye_snippets, filename_suffix='dye_snippets', snippet_chunksize=1000, dye_snippet_chunksize=1000, distance_function='chebyshev', override=False, **kwargs)[source]¶
Computes all pairwise distances from dye snippets to all other snippets.
- Expects snippets file to exist.
- Writes results to zarr with name snippets_dye_distances_from_{filename_suffix}
- This is a long-running function – see dask for progress.
Parameters:
- dye_snippets (np.array) – Numpy array of snippets to be dyed. Must be a subset of the snippets index.
- filename_suffix (str, optional) – Change to differentiate between dye_snippet sets. Defaults to dye_snippets.
- snippet_chunksize (int, optional) – Set the chunk size for the snippet xarray input, along the snippet dimension (not the symbol dimension). Defaults to 1000.
- dye_snippet_chunksize (int, optional) – Set the chunk size for the dye snippet xarray input, along the snippet dimension (not the symbol dimension). Defaults to 1000.
- distance_function (string or function, optional) – Provide a function to compute distances, or a string to use a built-in distance function. See dye_score.distances.py for example distance function templates. Default is "chebyshev". Alternatives are cosine, jaccard, cityblock.
- override (bool, optional) – Override output files. Defaults to False.
- kwargs – kwargs to pass to the distance function if required, e.g. mahalanobis requires vi.
Returns: str. Path results were written to.
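A custom distance function might look like the sketch below. The exact signature expected by compute_distances_for_dye_snippets is an assumption here; check dye_score/distances.py for the real templates.

```python
# Illustrative custom distance function. The two-vector signature is
# an assumption - see dye_score/distances.py for the real templates.
import numpy as np

def chebyshev(u, v):
    # Chebyshev (L-infinity) distance between two symbol vectors
    return float(np.max(np.abs(np.asarray(u) - np.asarray(v))))

chebyshev([0.0, 0.5, 1.0], [0.2, 0.1, 1.0])
```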
compute_dye_scores_for_thresholds(thresholds, leaky_threshold, filename_suffix='dye_snippets', override=False)[source]¶
Get dye scores for a range of distance thresholds.
- Uses results from compute_snippets_scores_for_thresholds
- Writes results to gzipped csv files with name dye_score_from_{filename_suffix}_{threshold}.csv.gz
Parameters:
- thresholds (list) – List of distances to compute snippet scores for, e.g. [0.23, 0.24, 0.25]
- leaky_threshold (float) – Remove all snippets which dye more than this fraction of all other snippets.
- filename_suffix (str, optional) – Change to differentiate between dye_snippet sets. Defaults to dye_snippets.
- override (bool, optional) – Override output files. Defaults to False.
Returns: list. Paths results were written to.
compute_leaky_snippet_data(thresholds_to_test, filename_suffix='dye_snippets', override=False)[source]¶
Computes leaky percentages for a range of thresholds. This enables the user to select the “leaky threshold” for following rounds.
- Writes results to parquet files with name leak_test_{filename_suffix}_{threshold}
Parameters:
- thresholds_to_test (list) – List of distances to compute the percentage of snippets dyed for, e.g. [0.23, 0.24, 0.25]
- filename_suffix (str, optional) – Change to differentiate between dye_snippet sets. Defaults to dye_snippets.
- override (bool, optional) – Override output files. Defaults to False.
Returns: list. Paths results were written to.
compute_snippets_scores_for_thresholds(thresholds, leaky_threshold, filename_suffix='dye_snippets', override=False)[source]¶
Get scores for snippets for a range of distance thresholds.
- Writes results to parquet files with name snippets_score_from_{filename_suffix}_{threshold}
Parameters:
- thresholds (list) – List of distances to compute snippet scores for, e.g. [0.23, 0.24, 0.25]
- leaky_threshold (float) – Remove all snippets which dye more than this fraction of all other snippets.
- filename_suffix (str, optional) – Change to differentiate between dye_snippet sets. Defaults to dye_snippets.
- override (bool, optional) – Override output files. Defaults to False.
Returns: list. Paths results were written to.
config(option)[source]¶
Method to retrieve config values.
Parameters: option (str) – The desired config option key.
Returns: The config option value.
dye_score_data_file(filename)[source]¶
Helper function to return a standardized filename.
The DyeScore class holds a dictionary to standardize the file names that DyeScore saves. This method looks up filenames by their short name.
Parameters: filename (str) – data file name.
Returns: str. The path where the data file should reside.
file_in_validation(inpath)[source]¶
Checks that a path exists; raises ValueError if not. Used for input files, as these must exist to proceed.
Parameters: inpath (str) – Path of input file.
file_out_validation(outpath, override)[source]¶
Checks whether a path exists. Raises ValueError if it exists and override is False; otherwise removes the existing file.
Parameters:
- outpath (str) – Path of output file.
- override (bool) – Whether to raise an error or remove existing data.
from_parquet_opts¶
Options used when reading from parquet.
get_input_df(columns=None)[source]¶
Helper function to return the input dataframe.
Parameters: columns (list, optional) – List of columns to retrieve. If None, all columns are returned.
Returns: dask.DataFrame. Input dataframe with the subset of columns requested.
s3_storage_options¶
s3 storage options built from config.
Returns: dict. If USE_AWS is True, returns s3 options as a dict, else None.
to_parquet_opts¶
Options used when saving to parquet.
Plotting utils¶
The following plotting utils can be used directly, or may be useful as template code for reviewing your results.
dye_score.plotting.get_plots_for_thresholds(ds, thresholds, leaky_threshold, n_scripts_range, filename_suffix='dye_snippets', y_range=(0, 1), recall_color='black', n_scripts_color='firebrick', **extra_plot_opts)[source]¶
Contributing¶
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
You can contribute in many ways:
Types of Contributions¶
Report Bugs¶
Report bugs at https://github.com/birdsarah/dye_score/issues.
If you are reporting a bug, please include:
- Your operating system name and version.
- Any details about your local setup that might be helpful in troubleshooting.
- Detailed steps to reproduce the bug.
Fix Bugs¶
Look through the GitHub issues for bugs. Anything tagged with “bug” and “help wanted” is open to whoever wants to implement it.
Implement Features¶
Look through the GitHub issues for features. Anything tagged with “enhancement” and “help wanted” is open to whoever wants to implement it.
Write Documentation¶
Dye Score could always use more documentation, whether as part of the official Dye Score docs, in docstrings, or even on the web in blog posts, articles, and such.
Submit Feedback¶
The best way to send feedback is to file an issue at https://github.com/birdsarah/dye_score/issues.
If you are proposing a feature:
- Explain in detail how it would work.
- Keep the scope as narrow as possible, to make it easier to implement.
- Remember that this is a volunteer-driven project, and that contributions are welcome :)
Get Started!¶
Ready to contribute? Here’s how to set up dye_score for local development.
Fork the dye_score repo on GitHub.
Clone your fork locally:
$ git clone git@github.com:your_name_here/dye_score.git
Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:
$ mkvirtualenv dye_score
$ cd dye_score/
$ python setup.py develop
Create a branch for local development:
$ git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
When you’re done making changes, check that your changes pass flake8 and the tests, including testing other Python versions with tox:
$ flake8 dye_score tests
$ python setup.py test or py.test
$ tox
To get flake8 and tox, just pip install them into your virtualenv.
Commit your changes and push your branch to GitHub:
$ git add .
$ git commit -m "Your detailed description of your changes."
$ git push origin name-of-your-bugfix-or-feature
Submit a pull request through the GitHub website.
Pull Request Guidelines¶
Before you submit a pull request, check that it meets these guidelines:
- The pull request should include tests.
- If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring, and add the feature to the list in README.rst.
- The pull request should work for Python 2.7, 3.4, 3.5 and 3.6, and for PyPy. Check https://travis-ci.org/birdsarah/dye_score/pull_requests and make sure that the tests pass for all supported Python versions.
Deploying¶
A reminder for the maintainers on how to deploy. Make sure all your changes are committed (including an entry in HISTORY.rst). Then run:
$ bumpversion patch # possible: major / minor / patch
$ git push
$ git push --tags
$ make release
History¶
Community Participation Guidelines¶
This repository is governed by Mozilla’s code of conduct and etiquette guidelines. For more details, please read the Mozilla Community Participation Guidelines.
How to Report¶
For more information on how to report violations of the Community Participation Guidelines, please read our How to Report page.