feat: add FAQ based on faqtory (#1283)

Hervé BREDIN
2023-03-16 15:04:26 +01:00
committed by GitHub
parent b56add2e20
commit 39a261a0eb
12 changed files with 174 additions and 12 deletions
+20
@@ -0,0 +1,20 @@
# Frequently Asked Questions
{%- for question in questions %}
- [{{ question.title }}](#{{ question.slug }})
{%- endfor %}
{%- for question in questions %}
<a name="{{ question.slug }}"></a>
## {{ question.title }}
{{ question.body }}
{%- endfor %}
<hr>
Generated by [FAQtory](https://github.com/willmcgugan/faqtory)
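The table-of-contents loop in this template can be sketched in plain Python. The `slugify` rule below is an assumption inferred from the anchors that appear in the generated `FAQ.md` of this commit (lowercase, sentence punctuation dropped, parentheses kept, spaces turned into hyphens); it is not taken from FAQtory's source, and `render_toc` is a hypothetical helper.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical slug rule, inferred from the anchors visible in this diff."""
    # Lowercase and drop sentence punctuation; parentheses are kept on purpose.
    cleaned = re.sub(r"[?.!,:;']", "", title.lower())
    # Collapse whitespace and join the words with hyphens.
    return "-".join(cleaned.split())


def render_toc(titles: list[str]) -> str:
    """Mirror the first template loop: one linked bullet per question."""
    lines = ["# Frequently Asked Questions"]
    lines += [f"- [{title}](#{slugify(title)})" for title in titles]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_toc([
        "Can I use gated models (and pipelines) offline?",
        "How does one spell and pronounce pyannote.audio?",
    ]))
```

The sketch only reproduces the link list; the second loop of the template emits an `<a name="...">` anchor with the same slug before each question body.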
+20
@@ -0,0 +1,20 @@
{%- if questions -%}
{% if questions|length == 1 %}
We found the following entry in the [FAQ]({{ faq_url }}) which you may find helpful:
{%- else %}
We found the following entries in the [FAQ]({{ faq_url }}) which you may find helpful:
{%- endif %}
{% for question in questions %}
- [{{ question.title }}]({{ faq_url }}#{{ question.slug }})
{%- endfor %}
Feel free to close this issue if you found an answer in the FAQ. Otherwise, please give us a little time to review.
{%- else -%}
Thank you for your issue. Give us a little time to review it.
PS. You might want to check the [FAQ]({{ faq_url }}) if you haven't done so already.
{%- endif %}
This is an automated reply, generated by [FAQtory](https://github.com/willmcgugan/faqtory)
+27
@@ -0,0 +1,27 @@
name: issues
on:
  issues:
    types: [opened]
jobs:
  add-comment:
    runs-on: ubuntu-latest
    permissions:
      issues: write
    steps:
      - uses: actions/checkout@v3
        with:
          ref: main
      - name: Install FAQtory
        run: pip install FAQtory
      - name: Run Suggest
        # Pass the untrusted issue title through an environment variable
        # instead of expanding it inside the shell command (script injection).
        env:
          ISSUE_TITLE: ${{ github.event.issue.title }}
        run: faqtory suggest "$ISSUE_TITLE" > suggest.md
      - name: Read suggest.md
        id: suggest
        uses: juliangruber/read-file-action@v1
        with:
          path: ./suggest.md
      - name: Suggest FAQ
        uses: peter-evans/create-or-update-comment@a35cf36e5301d70b76f316e867e7788a55a31dae
        with:
          issue-number: ${{ github.event.issue.number }}
          body: ${{ steps.suggest.outputs.content }}
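The `Run Suggest` step compares the title of a new issue with the questions (and their `alt_titles`) to decide which FAQ entries to link. As a rough illustration of that idea only, here is a sketch based on `difflib` from the standard library. The matching method, the `0.6` threshold and the `suggest` helper are all assumptions made for this example; FAQtory's actual matching may work differently.

```python
from difflib import SequenceMatcher

# Titles and alt_titles copied from two of the question files in this commit.
QUESTIONS = {
    "How can I improve performance?": [
        "Pretrained pipelines do not produce good results on my data. What can I do?",
        "It does not work! Help me!",
    ],
    "Does pyannote support streaming speaker diarization?": [
        "Is it possible to do realtime speaker diarization?",
        "Can it process online audio buffers?",
    ],
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest(issue_title: str, threshold: float = 0.6) -> list[str]:
    """Return FAQ titles whose title or any alt_title resembles the issue title."""
    matches = []
    for title, alt_titles in QUESTIONS.items():
        candidates = [title, *alt_titles]
        best = max(similarity(issue_title, candidate) for candidate in candidates)
        if best >= threshold:  # hypothetical cut-off, chosen for illustration
            matches.append(title)
    return matches


if __name__ == "__main__":
    print(suggest("Is it possible to do realtime speaker diarization?"))
```

An empty result corresponds to the `else` branch of the suggestion template, which only thanks the reporter and points to the FAQ.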
+35 -6
@@ -1,11 +1,17 @@
# Frequently asked questions
## How does one capitalize and pronounce the name of this awesome library?
# Frequently Asked Questions
- [Can I apply pretrained pipelines on audio already loaded in memory?](#can-i-apply-pretrained-pipelines-on-audio-already-loaded-in-memory)
- [Can I use gated models (and pipelines) offline?](#can-i-use-gated-models-(and-pipelines)-offline)
- [Does pyannote support streaming speaker diarization?](#does-pyannote-support-streaming-speaker-diarization)
- [How can I improve performance?](#how-can-i-improve-performance)
- [How does one spell and pronounce pyannote.audio?](#how-does-one-spell-and-pronounce-pyannoteaudio)
📝 Written in lower case: `pyannote.audio` (or `pyannote` if you are lazy). Not `PyAnnote` nor `PyAnnotate` (*sic*).
📢 [Pronounced](https://www.howtopronounce.com/french/pianote) like the french verb *pianoter*. *pi* like in **pi**ano, not *py* like in **py**thon.
🎹 *pianoter* means *to play the piano* (hence the logo 🤯).
<a name="can-i-apply-pretrained-pipelines-on-audio-already-loaded-in-memory"></a>
## Can I apply pretrained pipelines on audio already loaded in memory?
Yes: read [this tutorial](tutorials/applying_a_pipeline.ipynb) until the end.
<a name="can-i-use-gated-models-(and-pipelines)-offline"></a>
## Can I use gated models (and pipelines) offline?
**Short answer**: yes, see [this tutorial](tutorials/applying_a_model.ipynb) for models and [that one](tutorials/applying_a_pipeline.ipynb) for pipelines.
@@ -16,10 +22,33 @@ For instance, before gating `pyannote/speaker-diarization`, I had no idea that s
That being said, this whole authentication process does not prevent you from using official `pyannote.audio` models offline (i.e. without going through the authentication process in every `docker run ...` or whatever you are using in production): see [this tutorial](tutorials/applying_a_model.ipynb) for models and [that one](tutorials/applying_a_pipeline.ipynb) for pipelines.
## **[Pretrained pipelines](https://huggingface.co/models?other=pyannote-audio-pipeline) do not produce good results on my data. What can I do?**
<a name="does-pyannote-support-streaming-speaker-diarization"></a>
## Does pyannote support streaming speaker diarization?
**Short answer:** not out of the box, no.
**Long answer:** [I](https://herve.niderb.fr) am looking for sponsors to add this feature. In the meantime, [`diart`](https://github.com/juanmc2005/StreamingSpeakerDiarization) is the closest you can get to a streaming `pyannote.audio`. You might also be interested in [this blog post](https://herve.niderb.fr/fastpages/2021/08/05/Streaming-voice-activity-detection-with-pyannote.html) about streaming voice activity detection based on `pyannote.audio`.
<a name="how-can-i-improve-performance"></a>
## How can I improve performance?
**Long answer:**
1. Manually annotate dozens of conversations as precisely as possible.
2. Separate them into train (80%), development (10%) and test (10%) subsets.
3. Set up the data for use with [`pyannote.database`](https://github.com/pyannote/pyannote-database#speaker-diarization).
4. Follow [this recipe](https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/adapting_pretrained_pipeline.ipynb).
5. Enjoy.
**Also:** [I am available](https://herve.niderb.fr) for contracting to help you with that.
<a name="how-does-one-spell-and-pronounce-pyannoteaudio"></a>
## How does one spell and pronounce pyannote.audio?
📝 Written in lower case: `pyannote.audio` (or `pyannote` if you are lazy). Not `PyAnnote` nor `PyAnnotate` (sic).
📢 Pronounced like the French verb `pianoter`. `pi` like in `pi`ano, not `py` like in `py`thon.
🎹 `pianoter` means to play the piano (hence the logo 🤯).
<hr>
Generated by [FAQtory](https://github.com/willmcgugan/faqtory)
+1 -6
@@ -60,6 +60,7 @@ pip install pyannote.audio
## Documentation
- [Changelog](CHANGELOG.md)
- [Frequently asked questions](FAQ.md)
- Models
- Available tasks explained
- [Applying a pretrained model](tutorials/applying_a_model.ipynb)
@@ -84,12 +85,6 @@ pip install pyannote.audio
- [Speaker verification](tutorials/speaker_verification.ipynb)
- Visualization and debugging
## Frequently asked questions
* [How does one capitalize and pronounce the name of this awesome library?](FAQ.md)
* [Can I use gated models (and pipelines) offline?](FAQ.md)
* [Pretrained pipelines do not produce good results on my data. What can I do?](FAQ.md)
## Benchmark
Out of the box, `pyannote.audio` default speaker diarization [pipeline](https://hf.co/pyannote/speaker-diarization) is expected to be much better (and faster) in v2.x than in v1.1. Those numbers are diarization error rates (in %)
+7
@@ -0,0 +1,7 @@
# FAQtory settings
faq_url: "https://github.com/pyannote/pyannote-audio/blob/develop/FAQ.md" # Replace this with the URL to your FAQ.md!
questions_path: "./questions" # Where questions should be stored
output_path: "./FAQ.md" # Where FAQ.md should be generated
templates_path: ".faq" # Path to templates
+6
@@ -0,0 +1,6 @@
# Questions
Your questions should go in this directory.
Question files should be named with the extension ".question.md".
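To make the naming convention concrete, the sketch below collects every `*.question.md` file from a directory and splits each one into its front matter and Markdown body. It is a minimal stand-in that only understands the `title:` / `alt_titles:` layout used by the question files in this commit; it is not FAQtory's own loader, and `parse_question` and `load_questions` are hypothetical names.

```python
from pathlib import Path


def parse_question(text: str) -> tuple[dict, str]:
    """Split a question file into front-matter fields and Markdown body."""
    # The file starts with '---', so the first split element is empty.
    _, front, body = text.split("---", 2)
    meta = {"title": "", "alt_titles": []}
    for line in front.strip().splitlines():
        line = line.strip()
        if line.startswith("title:"):
            meta["title"] = line[len("title:"):].strip().strip('"')
        elif line.startswith("- "):
            # List items under 'alt_titles:'.
            meta["alt_titles"].append(line[2:].strip().strip('"'))
    return meta, body.strip()


def load_questions(questions_path: str) -> list[tuple[dict, str]]:
    """Read every '*.question.md' file found directly in questions_path."""
    paths = sorted(Path(questions_path).glob("*.question.md"))
    return [parse_question(path.read_text(encoding="utf-8")) for path in paths]


if __name__ == "__main__":
    for meta, _body in load_questions("./questions"):
        print(meta["title"], meta["alt_titles"])
```

Files in the directory that do not end in `.question.md` (such as this README) are ignored by the glob pattern.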
+16
@@ -0,0 +1,16 @@
---
title: "How can I improve performance?"
alt_titles:
- "Pretrained pipelines do not produce good results on my data. What can I do?"
- "It does not work! Help me!"
---
**Long answer:**
1. Manually annotate dozens of conversations as precisely as possible.
2. Separate them into train (80%), development (10%) and test (10%) subsets.
3. Set up the data for use with [`pyannote.database`](https://github.com/pyannote/pyannote-database#speaker-diarization).
4. Follow [this recipe](https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/adapting_pretrained_pipeline.ipynb).
5. Enjoy.
**Also:** [I am available](https://herve.niderb.fr) for contracting to help you with that.
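Step 2 of the list above (the 80% / 10% / 10% split) can be sketched with the standard library alone. The conversation identifiers, the fixed seed and the `split_conversations` helper are assumptions made for illustration; the resulting lists would still need to be declared in a `pyannote.database` setup as described in step 3.

```python
import random


def split_conversations(ids: list[str], seed: int = 0) -> dict[str, list[str]]:
    """Shuffle conversation ids reproducibly and split them 80/10/10."""
    shuffled = sorted(ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_dev = len(shuffled) // 10
    return {
        "train": shuffled[:n_train],
        "development": shuffled[n_train:n_train + n_dev],
        "test": shuffled[n_train + n_dev:],  # whatever remains
    }


if __name__ == "__main__":
    # Hypothetical identifiers for 50 annotated conversations.
    conversations = [f"conversation_{i:02d}" for i in range(50)]
    subsets = split_conversations(conversations)
    for name, items in subsets.items():
        print(name, len(items))
```

Splitting by whole conversation keeps every file in exactly one subset, so the development and test sets stay unseen during adaptation.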
+7
@@ -0,0 +1,7 @@
---
title: "Can I apply pretrained pipelines on audio already loaded in memory?"
alt_titles:
- "Can I apply models on an audio array?"
---
Yes: read [this tutorial](tutorials/applying_a_pipeline.ipynb) until the end.
+15
@@ -0,0 +1,15 @@
---
title: "Can I use gated models (and pipelines) offline?"
alt_titles:
- "Why does one need to authenticate to access the pretrained models?"
- "Can I use pyannote.audio pretrained pipelines without the Hugging Face token?"
- "How can I solve the permission issue?"
---
**Short answer**: yes, see [this tutorial](tutorials/applying_a_model.ipynb) for models and [that one](tutorials/applying_a_pipeline.ipynb) for pipelines.
**Long answer**: gating models and pipelines allows [me](https://herve.niderb.fr) to know a bit more about the `pyannote.audio` user base, which eventually helps me write grant proposals to make `pyannote.audio` even better. So, please fill in the gating forms as precisely as possible.
For instance, before gating `pyannote/speaker-diarization`, I had no idea that so many people were relying on it in production. Hint: sponsors are more than welcome! Maintaining open source libraries is time-consuming.
That being said, this whole authentication process does not prevent you from using official `pyannote.audio` models offline (i.e. without going through the authentication process in every `docker run ...` or whatever you are using in production): see [this tutorial](tutorials/applying_a_model.ipynb) for models and [that one](tutorials/applying_a_pipeline.ipynb) for pipelines.
+10
@@ -0,0 +1,10 @@
---
title: "How does one spell and pronounce pyannote.audio?"
alt_titles:
- "Why the name of the library?"
- "Why the logo of the library?"
---
📝 Written in lower case: `pyannote.audio` (or `pyannote` if you are lazy). Not `PyAnnote` nor `PyAnnotate` (sic).
📢 Pronounced like the French verb `pianoter`. `pi` like in `pi`ano, not `py` like in `py`thon.
🎹 `pianoter` means to play the piano (hence the logo 🤯).
+10
@@ -0,0 +1,10 @@
---
title: "Does pyannote support streaming speaker diarization?"
alt_titles:
- "Is it possible to do realtime speaker diarization?"
- "Can it process online audio buffers?"
---
**Short answer:** not out of the box, no.
**Long answer:** [I](https://herve.niderb.fr) am looking for sponsors to add this feature. In the meantime, [`diart`](https://github.com/juanmc2005/StreamingSpeakerDiarization) is the closest you can get to a streaming `pyannote.audio`. You might also be interested in [this blog post](https://herve.niderb.fr/fastpages/2021/08/05/Streaming-voice-activity-detection-with-pyannote.html) about streaming voice activity detection based on `pyannote.audio`.