OpenAssistant

OpenAssistant is an open-source, chat-based artificial intelligence (AI) assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to complete them.[1][2] The project is developed by a group of volunteers in collaboration with LAION. One of its development goals is free access to large language models that can be run locally on consumer hardware.[3]

OpenAssistant
Developer(s): LAION and contributors
Initial release: April 15, 2023
Type:
License: Apache License 2.0
Website: open-assistant.io

The project is backed by a worldwide crowdsourcing effort involving over 13,500 volunteers who have created 600k human-generated data points.[2][4][5][6]

Development

Development Roadmap

OpenAssistant developers aimed to build an initial minimum viable product (MVP) by following the three steps outlined in the InstructGPT paper:[7]

  1. Collecting high-quality, human-generated instruction-fulfillment samples (prompt + response), with a goal of more than 50,000 such samples. A crowdsourced process is designed to collect and review prompts, so that the model is not trained on flooding, toxic, spam, junk, or personal-information data. A leaderboard that shows progress and the most active users motivates the volunteer community.
  2. Sampling multiple completions for each of the collected prompts. The completions of one prompt are then shown in random order to users, who rank them from best to worst. Multiple votes by independent users have to be collected to measure the overall agreement. The gathered ranking data is then used to train a reward model.
  3. Running the reinforcement learning from human feedback (RLHF) training phase based on the prompts and the reward model.

The resulting model can then be fed back into the completion-sampling step (step 2 above) for the next iteration.[8]
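
The way ranked completions from step 2 can become a training signal for a reward model may be easier to see in code. The following Python sketch is illustrative only and is not taken from the OpenAssistant code base: the function names, the example completion identifiers, and the scores are hypothetical, and the loss is assumed to be the pairwise ranking loss described in the InstructGPT paper,[7] in which the reward model is penalized whenever it scores a lower-ranked completion above a higher-ranked one.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranking):
    """Turn one user ranking (best first) into (preferred, rejected) pairs.

    combinations() keeps the input order, so the first element of each
    pair is always the completion the user ranked higher.
    """
    return list(combinations(ranking, 2))


def pairwise_loss(score_preferred, score_rejected):
    """Pairwise ranking loss: -log(sigmoid(score_preferred - score_rejected)).

    Written in a numerically stable form so large score gaps do not overflow.
    """
    diff = score_preferred - score_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def mean_ranking_loss(rankings, scores):
    """Average the pairwise loss over all pairs from all users' rankings.

    `rankings` holds one ranking per user; `scores` maps each completion
    identifier to the (hypothetical) reward model's scalar score.
    """
    losses = [
        pairwise_loss(scores[preferred], scores[rejected])
        for ranking in rankings
        for preferred, rejected in ranking_to_pairs(ranking)
    ]
    return sum(losses) / len(losses)


# Hypothetical data: three users rank completions "a", "b", "c" of one prompt.
user_rankings = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "c"]]

# Hypothetical reward-model scores that mostly agree with the users.
agreeing_scores = {"a": 2.0, "b": 0.5, "c": -1.0}
# Scores that contradict the users by preferring "c" over "a".
opposing_scores = {"a": -1.0, "b": 0.5, "c": 2.0}

print(mean_ranking_loss(user_rankings, agreeing_scores))
print(mean_ranking_loss(user_rankings, opposing_scores))
```

In this sketch, the scores that agree with the users' rankings yield the lower mean loss, which is the direction training would push a reward model. A real implementation would compute the scores with a neural network and minimize the loss by gradient descent rather than evaluating fixed numbers.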

Development Status

On March 10, 2023, very early OpenAssistant models began generating responses to training prompts on the OpenAssistant website. These responses were opened for ranking, corresponding to step two of the InstructGPT process above, and the resulting data was to be fed into the training database. The models were iterations of pythia-6.9B-deduped.

On April 15, 2023, OpenAssistant was released to the public.[4]

As of May 11, 2023, OpenAssistant supports 40 languages, including Catalan, Bavarian, Esperanto and Basque.

References

  1. Open-Assistant, LAION AI, 2023-03-09, retrieved 2023-03-09
  2. Köpf, Andreas; Kilcher, Yannic; von Rütte, Dimitri; Anagnostidis, Sotiris; Tam, Zhi-Rui; Stevens, Keith; Barhoum, Abdullah; Duc, Nguyen Minh; Stanley, Oliver; Nagyfi, Richárd; ES, Shahul; Suri, Sameer; Glushkov, David; Dantuluri, Arnav; Maguire, Andrew (2023-04-14). "OpenAssistant Conversations -- Democratizing Large Language Model Alignment". arXiv:2304.07327 [cs.CL].
  3. Open-Assistant, LAION AI, 2023-03-09, retrieved 2023-03-09
  4. "OpenAssistant RELEASED! The world's best open-source Chat AI! | Open Assistant". laion-ai.github.io. 2023-04-15. Retrieved 2023-05-05.
  5. "Open Assistant: Explore the Possibilities of Open and Collaborative Chatbot Development". KDnuggets. Retrieved 2023-05-05.
  6. Shenwai, Dhanshree Shripad (2023-04-21). "Meet OpenAssistant: An open-source chat model That consists of a ~161K human-generated, human-annotated assistant-style conversation corpus, including 35 different languages". MarkTechPost. Retrieved 2023-05-05.
  7. Ouyang, Long; Wu, Jeff; Jiang, Xu; Almeida, Diogo; Wainwright, Carroll L.; Mishkin, Pamela; Zhang, Chong; Agarwal, Sandhini; Slama, Katarina; Ray, Alex; Schulman, John; Hilton, Jacob; Kelton, Fraser; Miller, Luke; Simens, Maddie (2022-03-04). "Training language models to follow instructions with human feedback". arXiv:2203.02155 [cs.CL].
  8. Open-Assistant, LAION AI, 2023-03-09, retrieved 2023-03-09
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.