Commit 953530fc authored by Jeff Wu

update readme with usage caveats and calls for research

This write-up was inspired in part by Mitchell et al.’s work on
[Model Cards for Model Reporting](https://arxiv.org/abs/1810.03993).
Adding such model usage sections could be good practice in general for
open source research projects with potentially broad applications.
parent ed0dedcd
@@ -6,6 +6,22 @@ For now, we have only released a smaller (117M parameter) version of GPT-2.
See more details in our [blog post](https://blog.openai.com/better-language-models/).
## Usage
This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2-117M. While GPT-2-117M is less proficient than GPT-2-1.5B, it is useful for a wide range of research and applications which could also apply to larger models.
### Some caveats
- The robustness and worst-case behaviors of GPT-2-117M are not well understood. As with any machine-learned model, carefully evaluate GPT-2-117M for your use case, especially if it is used without fine-tuning or in safety-critical applications where reliability is important.
- The dataset our GPT-2-117M was trained on contains many texts with [biases](https://twitter.com/TomerUllman/status/1101485289720242177) and factual inaccuracies, and thus GPT-2-117M is likely to be biased and inaccurate as well.
- To avoid having samples mistaken as human-written, we recommend clearly labeling samples as synthetic before wide dissemination. Our models are often incoherent or inaccurate in subtle ways, which takes more than a quick read for a human to notice.
### Work with us
Please [let us know](mailto:languagequestions@openai.com) if you’re doing interesting research with or working on applications of GPT-2-117M! We’re especially interested in hearing from and potentially working with those who are studying:
- Potential malicious use cases and defenses against them (e.g. the detectability of synthetic text)
- The extent of problematic content (e.g. bias) being baked into the models and effective mitigations
## Installation
Git clone this repository, and `cd` into the directory for the remaining commands.
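For concreteness, a minimal sketch of that step. It assumes the repository lives at the usual `openai/gpt-2` GitHub URL and still ships a `download_model.sh` helper that accepts the `117M` model name; neither of those details is shown in this diff, so adjust as needed:

```
# Clone the repository and enter it (URL assumed; substitute your mirror if needed)
git clone https://github.com/openai/gpt-2.git
cd gpt-2

# Fetch the 117M weights with the repo's helper script
# (script name and model name are assumptions, not part of this diff)
sh download_model.sh 117M
```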
@@ -53,7 +69,7 @@ and a valid install of [nvidia-docker 2.0](https://github.com/nvidia/nvidia-docker)
docker run --runtime=nvidia -it gpt-2 bash
```
## Sampling scripts
| WARNING: Samples are unfiltered and may contain offensive content. |
| --- |
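As a rough illustration of what this section covers, the commands below assume the repository's `src/generate_unconditional_samples.py` and `src/interactive_conditional_samples.py` scripts and their `--top_k` / `--temperature` flags exist as in the README of this era; none of that is shown in this diff:

```
# Unconditional samples from GPT-2-117M (script names and flags assumed)
python3 src/generate_unconditional_samples.py | tee samples.txt

# Restrict sampling to the top 40 tokens and lower the temperature
python3 src/generate_unconditional_samples.py --top_k 40 --temperature 0.7 | tee samples.txt

# Interactively complete a prompt you type in
python3 src/interactive_conditional_samples.py --top_k 40
```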
@@ -120,4 +136,4 @@ We are still considering release of the larger models.
## License
[MIT](./LICENSE)