
Term View

Cuda

Entries linked to "Cuda" across the quiet archive.

Entries 2
Categories 23
Tags 187

2 entries on this page

OpenAI Whisper Speech Recognition Guide

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.

GitHub Repository

Installation

    pip install git+https://github.com/openai/whisper.git

Fix CUDA not detecting GPU

Whisper will default to the CPU if a GPU is not detected, which is considerably slower.

    pip uninstall torch
    pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

Example usage

    # Transcribe
    whisper input.mp3 --model medium.en --language en --task transcribe

    # Translate
    whisper japanese.wav --model large --language Japanese --task translate

Available models and languages

There are five model sizes, four with English-only versions, offering speed and accuracy tradeoffs. Below are the names of the available models and their approximate memory requirements and relative speed.
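Because Whisper falls back to the CPU when no GPU is detected, it can be worth checking what PyTorch actually sees after reinstalling it. The following is a minimal sketch, assuming PyTorch may or may not be installed in the active environment; `pick_device` is a hypothetical helper name, not part of Whisper or PyTorch.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and can see a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed at all, so there is nothing to run on a GPU.
        return "cpu"
    import torch

    # torch.cuda.is_available() is False for CPU-only builds and when the
    # driver or CUDA runtime cannot be found.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"PyTorch would run on: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build is a likely culprit, which is the case the reinstall above is meant to address.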


MiDaS Depth Estimation Guide

GitHub Repository

During installation, I ran into an issue where the CUDA package wasn’t found. Had to modify environment.yaml to:

    name: midas-py310
    channels:
      - pytorch
      - defaults
    dependencies:
      - nvidia::cuda-toolkit=11.7.0
      - python=3.10.8
      - pytorch::pytorch=1.13.0
      - torchvision=0.14.0
      - pip=22.3.1
      - numpy=1.23.4
      - pip:
        - opencv-python==4.6.0.66
        - imutils==0.5.4
        - timm==0.6.12
        - einops==0.6.0

Commands that were helpful for troubleshooting CUDA:
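Before digging into CUDA itself, it can help to confirm that the pip pins from environment.yaml actually resolved in the active environment. This is a rough sketch using only the standard library; `PIP_PINS` and `find_mismatches` are illustrative names, not part of MiDaS, and the check covers only the pip section of the file, not the conda packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip section of environment.yaml.
PIP_PINS = {
    "opencv-python": "4.6.0.66",
    "imutils": "0.5.4",
    "timm": "0.6.12",
    "einops": "0.6.0",
}


def find_mismatches(pins):
    """Return {package: (pinned, installed_or_None)} for every pin that is not met."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    for name, (pinned, installed) in find_mismatches(PIP_PINS).items():
        print(f"{name}: pinned {pinned}, installed {installed or 'missing'}")
```

An empty result means every pip pin matches; anything printed points to a package that was skipped or resolved to a different version.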

