
Commit 341171b

committed
update FAQ
1 parent f2863c7 commit 341171b

1 file changed

Lines changed: 85 additions & 0 deletions

@@ -0,0 +1,85 @@
1+
---
2+
title: "Model License - FAQ"
3+
description: ""
4+
lead: ""
5+
date: 2020-11-12T15:22:20+01:00
6+
lastmod: 2020-11-12T15:22:20+01:00
7+
draft: false
8+
images: []
9+
menu:
10+
docs:
11+
parent: "pages"
12+
weight: 620
13+
toc: true
14+
---
15+
16+
We are releasing the first set of BigCode models, which are going to be licensed under the CodeML OpenRAIL-M 0.1 license, as we initially stated [here](https://www.bigcode-project.org/docs/about/ip/) and in our membership form. The CodeML OpenRAIL-M 0.1 is an interim version of the license that is being drafted for the release of BigCode in March 2023. This license is an open and responsible AI license (OpenRAIL).
17+
18+
## What is an OpenRAIL license?
19+
Open Responsible AI Licenses (OpenRAIL) are licenses designed to permit free and open access, re-use, and downstream distribution of derivatives of AI artifacts, for research, commercial or non-commercial purposes, as long as the use restrictions present in the license always apply (including to derivative works). For more information, please access the RAIL Initiative [post](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses).
20+
21+
## Why Open?
22+
The term “Open” indicates that the license enables royalty-free access to, downstream use of, and redistribution of the licensed material, as well as distribution of any derivatives of it.
23+
24+
To be clear, you can use any type of license or legal agreement you want to redistribute the model or derivatives of it, subject to only two conditions:
25+
Embedding the license clauses related to the responsible use of the model
26+
And ensuring that your legal agreement is consistent with those license clauses (i.e. Section 5, Attachment A).
27+
28+
See FAQ below: “Can you give me an example?”.
29+
30+
## Responsible?
31+
Responsible AI licensing is a mechanism that is part of, and interacts with, a broader system of AI governance instruments and processes, such as Model Cards and regulations.
32+
33+
OpenRAIL licenses are designed to promote responsible downstream use and distribution of the model by including a set of use restrictions that specify purposes for which the model cannot be used. In the case of the CodeML OpenRAIL-M, the restrictions are mainly inspired by BigScience’s approach to the licensing of LLMs, and also include specific use cases where we believe code generation models could be used as a harmful instrument due to either the intent of the user or the technical limitations of the model.
34+
35+
## What does the “M” stand for?
36+
“M” stands for “Model”, which is the artifact being licensed under this license and subject to use restrictions. See the naming conventions for RAIL Licenses [post](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses).
37+
38+
## Is this an Open Source license?
39+
This is not an open source license according to the [Open Source Initiative](https://opensource.org/osd) definition, because it has some restrictions on the use of the model. That said, it does not impose any restrictions on reuse, distribution, commercialization, or adaptation as long as the model is not being applied to use cases that have been restricted.
40+
41+
## What are we licensing?
42+
We are licensing ML models developed under the BigCode project. Any source code relevant to the BigCode ML models is licensed under the Apache 2.0 license.
43+
44+
## Why should BigCode decide what is appropriate or not regarding the use of the model?
45+
As creators of the models, we think about how our work could be used. We believe we should do as much as we can to prevent possible harms from our work while releasing it on an open basis, especially where possible use cases are incompatible with, or inappropriate for, the model's performance and/or capabilities.
46+
47+
## Do use restrictions apply to Derivatives of the Model?
48+
Yes. The use restrictions in the RAIL licenses have been designed to carry over into the downstream licensing terms of any derivative version of a BigCode model that a downstream user offers and/or releases. The distribution and use of any Derivatives of the Model (as defined in the license) should be governed by, at minimum, the same use restrictions.
49+
50+
## Can you give me an example?
51+
Imagine a company that wants to use a BigCode model in order to develop a version of a coding assistant. The company accesses the model, modifies it, and finetunes it to be the technical backbone of the coding assistant app. Firstly, these actions will be governed by the RAIL license. Secondly, and worth noting, according to the terms defined in our RAIL License, this coding assistant app is considered a Derivative of the Model. Thus, the use of the app will be governed by the use restrictions defined in the RAIL license, and accordingly, when commercializing the new version of the Model by means of a commercial license (or any other type of legal agreement), the latter will have to integrate these use restrictions as part of the subsequent license and/or Terms of Use of the specific API-based coding assistant app.
52+
53+
## Does the license cover every harmful use case?
54+
No. We recognize that the list of use restrictions cannot cover everything one could possibly do with our work. We focus on use cases that can be a source of concern.
55+
56+
## Is it possible for the licensor to remotely restrict the use of the models? If so, what does it mean?
57+
The model itself has no built-in mechanism for restricting its use. However, if the model is hosted behind an API, access can be restricted remotely, because the API access key can be revoked.
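
As a rough illustration of the API case, the sketch below shows how a hosting service might gate model access on a revocable key. Everything in it is hypothetical: `KeyRegistry`, `handle_request`, and the placeholder completion are names invented for this example and do not describe any actual BigCode or hosting-provider API.

```python
from dataclasses import dataclass, field


@dataclass
class KeyRegistry:
    """Hypothetical in-memory store of API keys for a hosted model."""

    active: set = field(default_factory=set)

    def issue(self, key: str) -> None:
        self.active.add(key)

    def revoke(self, key: str) -> None:
        # discard() does nothing if the key was never issued
        self.active.discard(key)

    def is_authorized(self, key: str) -> bool:
        return key in self.active


def handle_request(registry: KeyRegistry, key: str, prompt: str) -> str:
    """Serve a request only if the caller's key is still active."""
    if not registry.is_authorized(key):
        raise PermissionError("API key is missing or has been revoked")
    # Placeholder standing in for the real model call
    return f"<completion for: {prompt}>"
```

Once `revoke` has been called for a key, later requests with that key fail. Copies of the model weights that have already been downloaded are unaffected, which is why this only applies to API-hosted access.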
58+
59+
## Do I have to disclaim that the outputted code snippets were generated by a model?
60+
According to restriction (f), you cannot use the model to generate code or distribute code without intelligibly disclaiming that the code was generated by the model. What does this mean in practice? If you finetune a BigCode model, embed it into an app designed to be a code assistant, and plan to distribute the app as a SaaS, you should make it clear to users of your app that the code it produces is generated by a model. This may seem obvious to you but not to others, so it is important to be transparent with your users.
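
One possible way to surface such a notice in an app is to prepend a comment to every generated snippet. This is a minimal sketch: the wording of `DISCLAIMER` and the helper `with_disclaimer` are assumptions made for illustration, and whether a particular notice is sufficient depends on the license text itself.

```python
# Hypothetical notice text; the license does not prescribe this wording.
DISCLAIMER = "This code was generated by a machine learning model."


def with_disclaimer(code: str, comment_prefix: str = "#") -> str:
    """Prepend a one-line notice to a generated snippet, at most once."""
    notice = f"{comment_prefix} {DISCLAIMER}"
    if code.startswith(notice):
        return code  # already labeled, so do not add a second notice
    return f"{notice}\n{code}"
```

The `comment_prefix` parameter lets the same helper label snippets in languages with other comment syntax, for example `with_disclaimer(snippet, "//")` for JavaScript output.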
61+
62+
## What has been modified from the BigScience OpenRAIL-M license?
63+
There are 4 modifications:
64+
65+
- The Preamble has been adapted to code generation models.
66+
- Complementary material (source code) is not licensed under this license but separately under an Apache 2.0 license.
67+
- Clause 7 of the license no longer requires users to undertake reasonable efforts to use the last updated version of the Model, as is the case in the BigScience OpenRAIL-M license.
68+
- Attachment A includes a new restriction: Restriction (c) forbids the use of the Model to generate and/or disseminate malware. We understand that “malware”, according to the NIST definition, already includes the intent of harm. For example, the restriction covers taking a dataset composed of source code, signatures, or Indicators of Compromise (IOC) that are known to be malicious and finetuning a BigCode model on it in order to enable the automated generation and/or distribution of malware or related code.
69+
70+
## What if I do not understand some of the restrictions in Attachment A of the license?
71+
Community feedback is essential to us. We are grateful to receive it and happy to answer any questions related to the license. Drafting use restrictions for ML models that everyone agrees on and understands is a challenge. There is a balance to be struck between restrictions being too generic, on the one hand, and restrictions being too narrow and not covering potentially harmful scenarios, on the other. We are open to your comments.
72+
73+
## What other RAILs are out there?
74+
Since the [release](https://bigscience.huggingface.co/blog/the-bigscience-rail-license) of the BLOOM RAIL [license](https://huggingface.co/spaces/bigscience/license) in May 2022, there has been a proliferation of RAILs, based either on the structure and content of that license or simply on the aim of promoting responsible use of ML models, as with Meta's licenses for [OPT-175B](https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/MODEL_LICENSE.md), [BB3](https://github.com/facebookresearch/ParlAI/blob/main/parlai/zoo/bb3/MODEL_LICENSE.md), and [SEER](https://github.com/facebookresearch/vissl/blob/main/projects/SEER/MODEL_LICENSE.md). Other RAIL licenses based on the BLOOM RAIL have been released, such as [BigScience OpenRAIL-M](https://www.licenses.ai/blog/2022/8/26/bigscience-open-rail-m-license), [CreativeML OpenRAIL-M](https://huggingface.co/spaces/CompVis/stable-diffusion-license), [SIL RAIL-M 1.0](https://huggingface.co/spaces/sil-ai/model-license), and [OpenRAIL++M](https://www.ykilcher.com/license).
75+
76+
## Is there more information?
77+
Yes! Please have a look at the RAIL Initiative [FAQ](https://www.licenses.ai/faq-2).
78+
79+
# Other governance considerations related to the use of the model
80+
81+
## What if the model outputs code snippets belonging to an already existing repository under a permissive license? What about providing attribution and license notice?
82+
83+
Besides the opt-out tool and process we developed for the community (“[Am I in the Stack?](https://huggingface.co/spaces/bigcode/in-the-stack)”), we are currently working on tools that let users identify whether code generated by the model belongs to an existing code repository, so that they can inspect the applicable licensing requirements and comply with them when using and distributing the code.
84+
85+
