Question for the experts: can AMD cards run local large language models?

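459413498 posted on 2025-1-16 09:11

For example qwen, deepseek, klingAI and the like.

My current 1070ti makes the video stutter just from running real-time translation.

If the price is right, I'd like to switch to a 9070XT.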

ainomelody posted on 2025-1-16 09:19

Setting up WSL should work now. Previously you had to run it on Linux.

Demir posted on 2025-1-16 09:23

AMD cards are even worse than Intel cards for this.

炼金术士 posted on 2025-1-16 09:31

Unfortunately, not really.
For anything productivity-related, just stick with an NVIDIA card.

BFG9K posted on 2025-1-16 09:32

Just buy a 4060Ti 16G.

Leciel posted on 2025-1-16 09:38

7900xtx, 64G ram, amd 5900x@65w

N:\>ollama run phi4:latest --verbose
>>> briefly explain what is quantum physics
Quantum physics, also known as quantum mechanics, is a fundamental theory in physics that describes the behavior
of matter and energy at the smallest scales—such as atoms and subatomic particles like electrons and photons. It
departs from classical physics by introducing concepts such as quantization, wave-particle duality, superposition,
entanglement, and uncertainty.

1. **Quantization**: Many physical quantities, like energy, come in discrete units called "quanta." This means
that these quantities can only take on specific, fixed values rather than a continuous range.

2. **Wave-Particle Duality**: Particles such as electrons exhibit both wave-like and particle-like properties. For
example, they can interfere with themselves like waves while also being detected as discrete particles.

3. **Superposition**: A quantum system can exist in multiple states or configurations simultaneously until it is
measured or observed. Once a measurement occurs, the system 'collapses' to one of the possible states.

4. **Entanglement**: Particles can become entangled such that the state of one particle is directly related to the
state of another, no matter how far apart they are. Changes to the state of one entangled particle instantaneously
affect its partner, a phenomenon Albert Einstein famously referred to as "spooky action at a distance."

5. **Uncertainty Principle**: Formulated by Werner Heisenberg, this principle states that certain pairs of
properties (like position and momentum) cannot be simultaneously measured with arbitrary precision. The more
precisely one property is known, the less precisely the other can be determined.

Quantum physics has been incredibly successful in explaining a vast range of phenomena and forms the foundation
for many modern technologies, including semiconductors, lasers, and quantum computing. Despite its
counterintuitive nature, it remains one of the most well-validated theories in science.

total duration:    6.3898724s
load duration:    12.3162ms
prompt eval count: 17 token(s)
prompt eval duration: 118ms
prompt eval rate: 144.07 tokens/s
eval count:       373 token(s)
eval duration:    6.254s
eval rate:          59.64 tokens/s
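For readers comparing numbers across posts, the two rates in the log above appear to be plain token count divided by duration. A minimal sketch that reproduces them from the figures Leciel posted (the function name is hypothetical, not part of ollama):

```python
def tokens_per_second(token_count: int, duration_s: float) -> float:
    """Throughput as tokens divided by elapsed seconds."""
    return token_count / duration_s

# Figures copied from the --verbose output above.
prompt_rate = tokens_per_second(17, 0.118)   # prompt eval: 17 tokens in 118 ms
eval_rate = tokens_per_second(373, 6.254)    # generation: 373 tokens in 6.254 s

print(f"prompt eval rate: {prompt_rate:.2f} tokens/s")
print(f"eval rate:        {eval_rate:.2f} tokens/s")
```

The eval rate is the one that matters for how fast a reply streams out; the prompt eval rate only covers reading the input.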

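StevenG posted on 2025-1-16 09:41

It can run them. I'd recommend the 7900xtx because it has more VRAM. The 9070xt will most likely cost 5000+ yuan, so the final price gap between the two should only be several hundred yuan, but the extra VRAM lets you run models a tier larger.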

悠悠lyz posted on 2025-1-16 10:05

A single 7900XTX can run Qwen 70B.
Inference works, at 40 to 50 token/s.

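Leciel posted on 2025-1-16 10:13

悠悠lyz posted on 2025-1-16 10:05
A single 7900XTX can run Qwen 70B.
Inference works, at 40 to 50 token/s.

The largest I can run here is a 32B model. How did you deploy 70B on a single card?

Running the BF16 or FP16 model requires multiple cards with at least 144GB of VRAM in total (e.g. 2xA100-80G or 5xV100-32G); running the Int4 model requires at least 48GB of VRAM (e.g. 1xA100-80G or 2xV100-32G).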
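The requirement quoted above is roughly what a back-of-the-envelope weight calculation gives. A rough sketch, assuming the weights dominate memory use and ignoring KV cache and runtime overhead (so real requirements are higher); the function names are hypothetical:

```python
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

def weights_fit(params_billion: float, bits_per_weight: int, vram_gb: float) -> bool:
    """True if the weights alone fit in VRAM; says nothing about KV cache or overhead."""
    return weight_gb(params_billion, bits_per_weight) <= vram_gb

for params, bits in [(70, 16), (70, 4), (32, 4)]:
    size = weight_gb(params, bits)
    verdict = "fit" if weights_fit(params, bits, 24) else "do not fit"
    print(f"{params}B at {bits}-bit: ~{size:.0f} GB of weights, {verdict} in 24 GB")
```

By this estimate a 70B model at 4-bit needs about 35 GB for weights alone, which is more than one 24 GB card holds, while a 32B model at 4-bit needs about 16 GB.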

ganxy posted on 2025-1-16 10:17

Inference should work. After all, plenty of people even buy a mac mini to run these.

装陈醋的酱油瓶 posted on 2025-1-16 10:19

What are you using for the real-time translation?

yangzi123aaa20 posted on 2025-1-16 10:23

Those who can run it don't ask, and those who ask can't run it [snicker]

xyzhangabc posted on 2025-1-16 11:11

For LLM inference only, with no training, installing ollama on Windows is practically one-click deployment now. Run whatever you like.
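Once ollama is installed, it can also be called from a script over its local HTTP API. A minimal sketch, assuming the server is listening on its default local port and that the model named in the call has already been pulled; the helper names are hypothetical:

```python
import json
import urllib.request

# Assumption: ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for a single completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def generate(model: str, prompt: str) -> dict:
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read().decode("utf-8"))

def eval_rate(reply: dict) -> float:
    """Generation speed in tokens/s; eval_duration is reported in nanoseconds."""
    return reply["eval_count"] / (reply["eval_duration"] / 1e9)
```

With the server up, `reply = generate("phi4:latest", "briefly explain what is quantum physics")` returns the text in `reply["response"]`, and `eval_rate(reply)` gives a figure comparable to the `eval rate` line of `--verbose` output.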

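wun_008 posted on 2025-1-16 11:15

This post was last edited by wun_008 on 2025-1-16 11:20

I run Qwen 2.5 32b with ollama on two p102 cards at about 3 characters per second, which is already usable as a personal office assistant.
ollama can now pool VRAM across cards and even borrow system RAM to help, which is seriously impressive. I only spent 600 to 700 yuan in total.
AMD is mainly for gaming.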

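gdens84 posted on 2025-1-16 11:25

In my own testing, the 7800xt matches the 4070ti in inference speed (under linux). For inference I wouldn't recommend the 9070xt, for two reasons: first, inference basically depends only on VRAM size and bandwidth; second, AMD's ROCm support for new cards lags badly. If you want to run models, just buy a 7900xt/7900xtx.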
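The point about bandwidth can be made concrete with a common rule of thumb: if each generated token requires reading all the weights from VRAM once, memory bandwidth divided by model size gives a ceiling on tokens per second. A rough sketch under that assumption; the bandwidth and model size below are hypothetical placeholders, not measured or vendor figures:

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    return bandwidth_gb_s / model_size_gb

# Hypothetical inputs for illustration only; substitute the card's spec-sheet
# bandwidth and the on-disk size of the quantized model you actually run.
bandwidth_gb_s = 900.0
model_size_gb = 18.0

ceiling = max_tokens_per_second(bandwidth_gb_s, model_size_gb)
print(f"ceiling: ~{ceiling:.0f} tokens/s")
```

Real throughput lands below this ceiling, but the estimate shows why two cards with similar bandwidth and enough VRAM tend to generate at similar speeds.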

lavin posted on 2025-1-16 11:25

DEEPSEEK needs quite a lot of VRAM.

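我輩樹である posted on 2025-1-16 11:29

As long as you don't chase the latest releases, you're fine; Hugging Face has ROCm-ready builds you can pull directly. If you want the newest stuff, AMD basically never supports new features right away and you have to integrate them yourself, so it comes down to your own skills.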

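459413498 posted on 2025-1-16 12:12

装陈醋的酱油瓶 posted on 2025-1-16 10:19
What are you using for the real-time translation?

** A plugin with whisper support extracts the audio track from the video and turns it into Japanese or English subtitles, then a locally deployed model with a preset prompt translates them, so every line ends up in Chinese.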
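The translation half of that pipeline can be sketched roughly as follows. This is not the plugin mentioned above; it assumes the subtitles are already in SRT form and that `translate` is any callable that sends a prompt to a local model and returns its reply. All names here are hypothetical:

```python
import re
from typing import Callable, List, Tuple

Cue = Tuple[str, str, str]  # (index line, timing line, subtitle text)

def parse_srt(srt: str) -> List[Cue]:
    """Split SRT text into cues separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.splitlines()
        if len(lines) >= 3:
            cues.append((lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues

def build_prompt(text: str, target: str = "Chinese") -> str:
    """Wrap one subtitle line in a fixed translation instruction."""
    return (
        f"Translate the following subtitle line into {target}. "
        f"Reply with the translation only.\n\n{text}"
    )

def translate_srt(srt: str, translate: Callable[[str], str]) -> str:
    """Translate each cue's text, keeping index and timing lines unchanged."""
    out = []
    for index, timing, text in parse_srt(srt):
        out.append(f"{index}\n{timing}\n{translate(build_prompt(text))}")
    return "\n\n".join(out) + "\n"
```

Translating cue by cue keeps the timing intact, at the cost of one model call per line, which is where a slow card starts to fall behind real time.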

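zxy2001 posted on 2025-1-17 00:32

459413498 posted on 2025-1-16 12:12
** A plugin with whisper support extracts the audio track from the video and turns it into Japanese or English subtitles, then a locally deployed model with a preset prompt translates ...

That's advanced stuff... no more waiting for the fansub groups to watch American shows.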

FelixIvory posted on 2025-1-17 02:27

Switch to an hp z2 mini g1a workstation.
strix halo, 128g of RAM, with up to 96g usable as VRAM.

自挂东南枝 posted on 2025-1-17 02:31

Yes. The 7900xtx's 24G of VRAM is great value. Just use koboldcpp, which can run on either vulkan or rocm.

悠悠lyz posted on 2025-2-11 21:09

Leciel posted on 2025-1-16 10:13
The largest I can run here is a 32B model. How did you deploy 70B on a single card?

I need to confirm this.

Chiphell - Sharing and Exchanging User Experience's Archiver