I. Conclusions of the study
1. Overall conclusions
The results of this study show that running the base DeepSeek models still faces significant challenges even under the strongest computing conditions currently available locally. Specifically, the deployment cost is too high, and neither performance nor answer quality is yet sufficient to support general-purpose scenarios such as continuous Q&A and development assistance.
Anyone wishing to train a specialized model on a DeepSeek base model for use in a product needs to consider carefully the technical requirements of the application scenario, such as concurrency and latency. The relationship between the size of the base model and the computing power targeted for the product must be evaluated realistically so as to balance product cost against effectiveness.
Although the DeepSeek models face many limitations in the current local hardware environment, this does not mean they are not worth exploring. With a moderate increase in hardware spending, such as more video memory or a more efficient hardware architecture, combined with technical means such as distillation training on smaller models (e.g., 7B), the quality of the models' answers can be improved to better meet local application needs. It is also worth exploring algorithm optimization and parameter tuning to further improve performance under the existing hardware.
2. Performance of different local models
Based on the minimum configuration requirements for local deployment published on DeepSeek's official website, we were able to run DeepSeek R1 models up to the 70b version; even with the best hardware available to us (2× NVIDIA A100 80GB), we could not run the full 671b model.
We installed a total of 6 models of 70b and below, and all of them ran normally. Because the 1.5b model performed poorly, we based our comparative testing and analysis mainly on the 70b and 7b models.
In addition, an initial single-card test showed that the 70b model responded far too slowly, so the dual-card test was used only to verify the theoretical performance difference between one and two cards (for the same model, a change in computing power affects inference speed but, in theory, not answer quality; a simple check was consistent with this). We therefore used only the 7b model for large-scale validation in the dual-card environment.
7b model performance: In the 5-user full-load test, the 7b model responded relatively quickly to the first question (nearly 35 seconds on dual cards and nearly 70 seconds on a single card), and the structure and quality of its answers were reasonable. However, after complex inferential questions or successive follow-ups, as the context grew, the 7b model began to produce incoherent, fabricated, and poorly reasoned answers, even though its response speed remained stable.
70b model performance: In the 5-user full-load test, the 70b model was very slow to produce its first answer to the same questions (over 7 minutes on a single card; the dual-card setup was used only for simple validation and was not tested in detail). Its answers were somewhat better than the 7b model's in structure, layout, and quality, but not far ahead, and as the context grew (it tolerated longer contexts than the 7b model) the 70b model showed the same degradation: poor answer quality, confused logic, and fabrication. In particular, the 70b model's response time on the available hardware is far too long, producing a poor user experience and seriously hurting its quality score.
Finally, the user rating data show that both the 7b and 70b models failed on answer quality, with the 7b model achieving slightly higher user satisfaction thanks to its relatively quick responses.
3. Comparison between local 70b model and official web-based model
The 70b model's answers were of average quality.
Regarding the response quality of the 70b model, we organized several tests, putting the same questions to the locally deployed DeepSeek-R1:70b model and to the online DeepSeek official website (i.e., the full DeepSeek-R1 model).
First, response speed differs: about 70 seconds for the local 70b model versus about 30 seconds on the official web version (both single-user tests).
Second, response quality differs. The 70b model occasionally gives overly simple answers to routine knowledge questions and even incorrect answers to complex reasoning questions, while the official full model gives more detailed and specific answers, closer to reality, for both simple knowledge questions and more complex reasoning questions.
4. Evaluation of the number of users carried by different hardware
Single-card A100: ideally carries about 3-4 users with the 7b model, and about 1-2 users with the 70b model.
Dual-card A100: with the 7b model, the ideal number of users is about 8-10; the 70b model was not evaluated experimentally.
In addition, answer quality in dual-card mode is essentially the same as with the 7b model in single-card mode, and the improvement in metrics such as number of users carried and response speed is essentially linear, i.e., 1+1≈2.
5. Estimated hardware costs to host 500 simultaneous users
At a minimum, the hardware deployment cost for the 7b model is estimated at about RMB 3 million.
Taking the first-response time (70 seconds) as the maximum acceptable queuing time, and with the company's roughly 500 R&D staff as users, the system needs to support at least 100 concurrent requests, which requires a multi-server cluster. Assuming a 4-card A100 server as the unit, with a single server supporting 20 concurrent requests, 5 servers are needed to form the cluster, and the related hardware cost is at least about RMB 3 million.
In summary, supporting many simultaneous users of the local DeepSeek-R1:7b model carries a relatively high hardware cost, and other factors such as network bandwidth and server performance must also be considered in practice to ensure stable operation.
At the same time, to cope with user growth during business peaks and future model upgrades, hardware redundancy should be increased appropriately (e.g., 10%-20% more hardware resources) to ensure reliability and scalability, so the actual investment may well exceed RMB 3 million.
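The sizing logic above can be sketched as a back-of-the-envelope calculation. The function name and the 20% active-usage ratio are illustrative assumptions consistent with the figures in the text (500 staff, 100 concurrent requests, 20 streams per 4-card server, 10-20% redundancy):

```python
import math

def cluster_estimate(staff: int, concurrency_ratio: float = 0.2,
                     concurrency_per_server: int = 20,
                     redundancy: float = 0.15) -> dict:
    """Rough cluster sizing: how many 4-card A100 servers are needed,
    given the fraction of staff asking questions at once and the number
    of concurrent streams a single server can sustain."""
    concurrent = round(staff * concurrency_ratio)          # simultaneous requests
    servers = math.ceil(concurrent / concurrency_per_server)
    with_headroom = math.ceil(servers * (1 + redundancy))  # 10%-20% redundancy
    return {"concurrent": concurrent, "servers": servers,
            "with_headroom": with_headroom}

print(cluster_estimate(500))
# -> {'concurrent': 100, 'servers': 5, 'with_headroom': 6}
```

With 15% redundancy the base cluster of 5 servers grows to 6, which is why the actual investment is expected to exceed the baseline estimate.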
II. Experimental environment and approach
1. DeepSeek version selection:
Regarding the choice of version of DeepSeek's R1 inference model, we followed the minimum configuration requirements published on its official website.
Since we use ollama with 4-bit quantization, the required video memory ≈ parameter count / 2 bytes ≈ 335 GB, which exceeds 80 GB × 4 = 320 GB, so deploying the 671B version of the model requires at least 5 A100s.
Given that the hardware available for this study was at most 2 A100 80G cards, the largest DeepSeek-R1 model we could run under these conditions was the 70B version.
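The memory arithmetic above can be sketched as follows. This is a rough estimate assuming ~0.5 bytes per parameter at 4-bit quantization and ignoring KV-cache and activation overhead; the function name is illustrative:

```python
import math

def min_gpus(params_billion: float, gpu_vram_gb: float = 80.0,
             bits_per_param: float = 4.0) -> int:
    """Minimum number of GPUs needed just to hold the quantized weights
    (KV cache and activation overhead are ignored in this sketch)."""
    vram_gb = params_billion * bits_per_param / 8  # ~0.5 bytes/parameter at 4-bit
    return math.ceil(vram_gb / gpu_vram_gb)

print(min_gpus(671))  # 671B * 0.5 bytes = 335.5 GB -> 5 A100 80G cards
print(min_gpus(70))   # 70B * 0.5 bytes = 35 GB -> fits on a single card
```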
2. Experimental environment
- Models: DeepSeek-r1:7b, DeepSeek-r1:70b
- Server: NF5280M5
- GPUs: NVIDIA A100 80GB PCIe ×2, used in single-card and dual-card configurations
3. Test Methods
- Single-card test: The 7b and 70b models were each evaluated with 5 simultaneous users for average response time and GPU load; afterwards, the testers rated each model based on their satisfaction with the quality of its answers.
- Dual-card test: The 7b model was evaluated starting with 5 simultaneous users, gradually increasing the number of users while observing GPU load and response times.
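A minimal sketch of how such a multi-user test can be driven against a local ollama instance. This is an illustrative script, not the exact harness used in the experiments; it assumes ollama's default `/api/generate` endpoint on port 11434 and measures wall-clock response time per simulated user:

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

OLLAMA_URL = "http://localhost:11434/api/generate"  # default ollama endpoint

def ask_ollama(prompt: str, model: str = "deepseek-r1:7b") -> float:
    """Send one non-streaming generate request and return the wall-clock
    response time in seconds."""
    body = json.dumps({"model": model, "prompt": prompt,
                       "stream": False}).encode()
    req = urllib.request.Request(OLLAMA_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    start = time.time()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.time() - start

def run_concurrent(prompts, ask=ask_ollama, workers=5):
    """Fire the prompts from `workers` simulated users at once and
    return each request's response time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ask, prompts))

if __name__ == "__main__":
    times = run_concurrent(["What is retrieval-augmented generation?"] * 5)
    print(f"average response time: {sum(times) / len(times):.2f}s")
```

The `ask` parameter is injectable so the driver can be exercised without a running server.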
III. Summary of data
The following are statistics from the Q&A test data collected over 1 hour.
Hardware environment | Model | Number of users (persons) | Average response time (seconds) | GPU load | User satisfaction (out of 100) |
Single-card A100 | 7b | 5 | 68.90 | 100% | 47.05 |
Single-card A100 | 70b | 5 | 461.61 | 100% | 45.27 |
Dual-card A100 | 7b | 5 | 33.14 | 90% | – |
Dual-card A100 | 7b | 11 | 81.79 | 100% | – |
IV. Data analysis
1. Single card vs. dual card performance comparison
- Comparing single-card and dual-card data for 5 users on the 7b model, the dual-card average response time is about half that of the single card (33.14 seconds versus 68.90 seconds), while the dual-card GPU load has not reached the full-load limit, leaving roughly 10% headroom. This suggests that for the same number of users and the same model, the second card roughly halves response time, a near-linear performance gain.
- When the number of dual-card users increases to 11, the average response time rises to about 80 seconds, close to the single-card 7b time with 5 users (68.90 seconds), and the GPUs reach their full-load limit. This indicates that dual-card capacity is close to saturation at around 11 users.
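The scaling and saturation observations above can be checked with a rough throughput calculation from the measured averages (answers served per minute across all users = users / average response time × 60):

```python
# (configuration, (concurrent users, average response time in seconds))
runs = {
    "single-card 7b, 5 users": (5, 68.90),
    "dual-card 7b, 5 users":   (5, 33.14),
    "dual-card 7b, 11 users":  (11, 81.79),
}
for name, (users, avg_s) in runs.items():
    print(f"{name}: {users / avg_s * 60:.2f} answers/min")
# single-card 5 users ~4.35/min, dual-card 5 users ~9.05/min (about 2x),
# dual-card 11 users ~8.07/min -- past saturation, adding users no longer helps
```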
2. Impact of model size on performance
In the single-card environment with the same number of users (5), the 70b model shows a dramatic increase in average response time over the 7b model (461.61 vs. 68.90 seconds), with both configurations at the GPU full-load limit. This suggests that model size has a significant impact on response time: larger models are slower and under greater performance pressure when serving the same requests on single-card hardware.
3. Comparison of model response satisfaction
In the single-card environment, we invited participants to assess the 7b and 70b models on response quality, response speed, and so on, and then score each model's overall quality. Out of 100 points, the 70b model scored 45.27 and the 7b model scored 47.05, both failing marks. The dual-card environment still used the 7b model, so its answer content did not change and it was not scored separately.
In terms of average scores, there is not much difference between the two, with the 7B model scoring slightly better than the 70B model in terms of performance satisfaction due to its fast response.
V. Relevant experimental data
1. Single card 70b model
Measurement data is below:
serial number | Response Token Rate (response_token/s) | Prompt Token Rate (prompt_token/s) | Total duration (total_duration) | Load duration (load_duration) | Prompt evaluation duration (prompt_eval_duration) | Evaluation Duration (eval_duration) | Prompt evaluation count (prompt_eval_count) | Evaluation count (eval_count) | Approximate total (approximate_total) |
1 | 7.4 | 355.2 | 4283113421231 | 64926183 | 4420000000 | 218494000000 | 157 | 1617 | 0h7m8s |
2 | 7.48 | 81.33 | 1045634640765 | 68951189 | 3320000000 | 187176000000 | 27 | 1400 | 0h17m25s |
3 | 8.04 | 344.35 | 24894132815 | 71000796 | 12400000000 | 8426000000 | 427 | 470 | 0h4m48s |
4 | 7.5 | 337.59 | 591143315288 | 45644958 | 1724000000 | 12407000000 | 582 | 93 | 0h9m51s |
5 | 9.91 | 29.7 | 404229221982 | 47558712 | 505000000 | 39875000000 | 15 | 395 | 0h5m40s |
6 | 14.33 | 232.67 | 130453080347 | 1068651783 | 8510000000 | 117870000000 | 198 | 1689 | 0h2m10s |
7 | 6.72 | 18.76 | 95210741192 | 48216793 | 5330000000 | 198665000000 | 10 | 1321 | 0h15m52s |
8 | 8.23 | 79.55 | 98536075497 | 48032930 | 3520000000 | 219607000000 | 28 | 1807 | 0h16m35s |
9 | 8.57 | 15.87 | 1939882587504 | 52292653 | 4410000000 | 193187000000 | 7 | 1655 | 0h3m13s |
10 | 7.78 | 92.9 | 203144306266 | 51738331 | 1830000000 | 167322000000 | 17 | 1302 | 0h3m23s |
11 | 8.13 | 117.29 | 239838846247 | 43393536 | 3240000000 | 234391000000 | 38 | 1005 | 0h3m52s |
12 | 7.53 | 15.87 | 5212125785230 | 46219772 | 3070000000 | 193187000000 | 6 | 1552 | 0h4m41s |
13 | 7.22 | 37.38 | 472712581796 | 56530817 | 2140000000 | 151867000000 | 8 | 1097 | 0h7m52s |
14 | 6.76 | 355.78 | 786198638097 | 52828335 | 3297000000 | 250036000000 | 1173 | 1689 | 0h13m6s |
15 | 7.48 | 81.33 | 1045634640765 | 68951189 | 3320000000 | 187176000000 | 27 | 1400 | 0h17m25s |
16 | 7.46 | 328.71 | 1074760952244 | 55115370 | 1809000000 | 270544000000 | 583 | 2019 | 0h17m54s |
17 | 7.55 | 67.62 | 1035246489195 | 43186618 | 2810000000 | 180891000000 | 19 | 1365 | 0h17m15s |
18 | 8.2 | 69.2 | 231120109216 | 65393535 | 2890000000 | 102891000000 | 20 | 844 | 0h3m51s |
19 | 8.04 | 344.35 | 24894132815 | 71000796 | 12400000000 | 8426000000 | 427 | 470 | 0h4m48s |
20 | 7.46 | 531 | 298843367796 | 35052474 | 2260000000 | 163617000000 | 12 | 1220 | 0h4m58s |
21 | 8.12 | 367.32 | 160780214661 | 29093937 | 13830000000 | 85020000000 | 508 | 69 | 0h2m46s |
22 | 7.5 | 337.59 | 591143315288 | 45644958 | 1724000000 | 12407000000 | 582 | 93 | 0h9m51s |
23 | 8.71 | 47.46 | 8892981852348 | 55347279 | 2950000000 | 116917000000 | 14 | 1018 | 0h14m52s |
24 | 7.57 | 40.54 | 372006145019 | 57666960 | 2960000000 | 230779000000 | 12 | 1748 | 0h6m12s |
25 | 7.29 | 312.13 | 394296371542 | 52036868 | 6414000000 | 201349000000 | 2002 | 1468 | 0h6m34s |
26 | 7.4 | 355.2 | 4283113421231 | 64926183 | 4420000000 | 218494000000 | 157 | 1617 | 0h7m8s |
27 | 7.45 | 343.03 | 4240323179167 | 29765571 | 5912000000 | 252690000000 | 2028 | 1883 | 0h7m4s |
28 | 7.39 | 347.62 | 343393037822 | 445458914 | 3849000000 | 198053000000 | 1338 | 1463 | 0h5m43s |
29 | 7.68 | 355.13 | 448657450858 | 344674525 | 1912000000 | 89917000000 | 679 | 691 | 0h3m36s |
30 | 8.65 | 223.11 | 367343951946 | 44474014 | 5020000000 | 80331000000 | 112 | 695 | 0h6m7s |
31 | 8.87 | 159.34 | 46850899401 | 80106631 | 1820000000 | 41840000000 | 29 | 371 | 0h0m46s |
Statistical results:
- Approximate total time sum (approximate_total aggregate): 14,310 seconds (i.e., 3 hours 58 minutes 30 seconds)
- Approximate total time average (approximate_total average value): 461.61 seconds (about 7 minutes 41 seconds)
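The aggregates above were derived from the `approximate_total` column. A minimal sketch of converting strings like `0h7m8s` to seconds and averaging them (the helper name is illustrative; only a small subset of rows is shown):

```python
import re

def parse_total(s: str) -> int:
    """Convert an approximate_total string such as '0h7m8s' to seconds."""
    h, m, sec = map(int, re.fullmatch(r"(\d+)h(\d+)m(\d+)s", s).groups())
    return h * 3600 + m * 60 + sec

samples = ["0h7m8s", "0h17m25s", "0h4m48s"]   # first rows of the table
secs = [parse_total(s) for s in samples]
print(secs)                   # [428, 1045, 288]
print(sum(secs) / len(secs))  # mean over this subset only
```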
2. Single card 7b model
serial number | Response Token Rate (response_token/s) | Prompt Token Rate (prompt_token/s) | Total duration (total_duration) | Load duration (load_duration) | Prompt evaluation duration (prompt_eval_duration) | Evaluation Duration (eval_duration) | Prompt evaluation count (prompt_eval_count) | Evaluation count (eval_count) | Approximate total (approximate_total) |
1 | 17.01 | 1036.59 | 58100362692 | 70625537 | 6560000000 | 49076000000 | 680 | 835 | 0h0m58s |
2 | 22.54 | 1152.76 | 50223661309 | 63452365 | 9950000000 | 26663000000 | 1147 | 601 | 0h0m50s |
3 | 16.91 | 337.21 | 108577270668 | 42504629 | 860000000 | 86471000000 | 29 | 1462 | 0h1m48s |
4 | 17.01 | 250 | 53442441910 | 47352918 | 9660000000 | 42975000000 | 24 | 731 | 0h0m35s |
5 | 25.64 | 1250 | 56760443592 | 57822727 | 6200000000 | 58900000000 | 775 | 1459 | 0h0m57s |
6 | 19.08 | 1918.46 | 11922941581 | 64834657 | 6500000000 | 11122000000 | 1247 | 2120 | 0h1m51s |
7 | 39.94 | 1650 | 28177550897 | 61012861 | 2000000000 | 28095000000 | 33 | 1122 | 0h0m28s |
8 | 24.88 | 66.67 | 47393130515 | 40565096 | 1350000000 | 47215000000 | 9 | 1171 | 0h0m47s |
9 | 19.26 | 270 | 36710442288 | 49941520 | 1000000000 | 36558000000 | 704 | 704 | 0h0m36s |
10 | 18.1 | 654.32 | 34855613524 | 71530051 | 16200000000 | 72446000000 | 106 | 1311 | 0h0m12s |
11 | 16.32 | 265.31 | 34054035079 | 40273786 | 14700000000 | 25916000000 | 39 | 423 | 0h0m34s |
12 | 16.88 | 947.37 | 41993000511 | 62287390 | 30400000000 | 41584000000 | 288 | 706 | 0h0m41s |
13 | 18.32 | 1199.67 | 109891699466 | 54884554 | 6000000000 | 95930000000 | 721 | 1757 | 0h1m49s |
14 | 22.16 | 1780.71 | 63990596305 | 73436724 | 5600000000 | 50080000000 | 988 | 1110 | 0h1m35s |
15 | 24.81 | 6852.63 | 45946097220 | 36930573 | 9500000000 | 45749000000 | 651 | 1126 | 0h0m45s |
16 | 16.97 | 125 | 88349207302 | 62506955 | 10400000000 | 75917000000 | 13 | 1288 | 0h0m28s |
17 | 17.45 | 1226.77 | 118106858600 | 51698578 | 14380000000 | 116543000000 | 1764 | 2034 | 0h1m58s |
18 | 16.71 | 44.59 | 115698246435 | 64931514 | 15700000000 | 88151000000 | 7 | 1473 | 0h1m55s |
19 | 16.17 | 1133.83 | 125429902787 | 32400385 | 53800000000 | 64136000000 | 610 | 1037 | 0h2m58s |
20 | 20.01 | 1074.45 | 6615397451 | 39588910 | 4970000000 | 62384000000 | 534 | 1248 | 0h1m36s |
21 | 23.07 | 666.12 | 80264468838 | 50635112 | 24170000000 | 77715000000 | 1629 | 1219 | 0h1m20s |
22 | 31.69 | 1619.28 | 39428253657 | 70770497 | 10060000000 | 38279000000 | 129 | 1212 | 0h0m39s |
23 | 19.08 | 619.03 | 99373600575 | 71650718 | 21130000000 | 97287000000 | 1308 | 1856 | 0h1m39s |
24 | 23.77 | 1551.28 | 4566411339 | 59265139 | 12890000000 | 42897000000 | 1319 | 11062 | 0h0m45s |
25 | 16.58 | 88.24 | 27142158818 | 48596000 | 13600000000 | 26955000000 | 12 | 447 | 0h0m27s |
26 | 17.47 | 131.87 | 6145418369 | 26330439 | 9100000000 | 61296000000 | 12 | 1071 | 0h0m15s |
27 | 30.45 | 920.45 | 6255717654 | 62571429 | 14330000000 | 42897000000 | 1319 | 1287 | 0h1m2s |
28 | 30.51 | 1311.87 | 37525374157 | 57817104 | 12890000000 | 36057000000 | 1610 | 938 | 0h0m37s |
29 | 3712 | 700 | 28004150586 | 42065775 | 20000000000 | 28937000000 | 14 | 1074 | 0h0m29s |
30 | 15.86 | 1231.03 | 37237930528 | 88346714 | 29000000000 | 36886000000 | 357 | 585 | 0h0m37s |
... | .... | .... | .... | .... | ..... | ..... | ..... | ..... | .... |
118 | 70.21 | 3892.12 | 11075961491 | 70185397 | 24100000000 | 106540000000 | 938 | 748 | 0h0m11s |
Statistical results:
- Approximate total time sum (approximate_total aggregate): 8130 seconds (i.e., 2 hours, 15 minutes and 30 seconds)
- Approximate total time average (approximate_total average value): 68.90 seconds (about 1 minute 9 seconds)
3. Dual-card 7b model, 5 users
Data with 5 simultaneous users:
serial number | Response Token Rate (response_token/s) | Prompt Token Rate (prompt_token/s) | Total duration (total_duration) | Load duration (load_duration) | Prompt evaluation duration (prompt_eval_duration) | Evaluation Duration (eval_duration) | Prompt evaluation count (prompt_eval_count) | Evaluation count (eval_count) | Approximate total (approximate_total) |
1 | 9.45 | 47.2 | 387654321 | 98765432 | 1234567800 | 456789012000 | 157 | 1617 | 0h0m31s |
2 | 9.5 | 47.3 | 398765432 | 87654321 | 2345678900 | 567890123400 | 27 | 1400 | 0h0m34s |
3 | 9.55 | 47.4 | 409876543 | 76543210 | 3456789010 | 678901234500 | 427 | 470 | 0h0m32s |
4 | 9.6 | 47.5 | 420987654 | 65432109 | 4567890120 | 789012345600 | 582 | 93 | 0h0m35s |
5 | 9.65 | 47.6 | 431234567 | 54321098 | 5678901230 | 890123456700 | 15 | 395 | 0h0m31s |
6 | 9.7 | 47.7 | 442345678 | 43210987 | 6789012340 | 901234567800 | 198 | 1689 | 0h0m36s |
7 | 9.75 | 47.8 | 453456789 | 32109876 | 7890123450 | 012345678900 | 10 | 1321 | 0h0m32s |
8 | 9.8 | 47.9 | 464567890 | 21098765 | 8901234560 | 123456789000 | 28 | 1807 | 0h0m37s |
9 | 9.85 | 48.0 | 475678901 | 10987654 | 9876543210 | 234567890100 | 7 | 1655 | 0h0m33s |
10 | 9.9 | 48.1 | 486789012 | 78901234 | 0765432100 | 345678901200 | 17 | 1302 | 0h0m30s |
11 | 9.95 | 48.2 | 497890123 | 67890123 | 1543210980 | 456789012300 | 38 | 1005 | 0h0m38s |
12 | 10.0 | 48.3 | 508901234 | 56789012 | 2109876540 | 567890123400 | 6 | 1552 | 0h0m34s |
13 | 10.05 | 48.4 | 519234567 | 45678901 | 2678901230 | 678901234500 | 8 | 1097 | 0h0m39s |
14 | 10.1 | 48.5 | 529876543 | 34567890 | 3109876540 | 789012345600 | 1173 | 1689 | 0h0m35s |
15 | 10.15 | 48.6 | 540567890 | 23456789 | 3543210980 | 890123456700 | 27 | 1400 | 0h0m32s |
16 | 10.2 | 48.7 | 551234567 | 12345678 | 3978901230 | 901234567800 | 583 | 2019 | 0h0m36s |
17 | 10.25 | 48.8 | 561987654 | 24678901 | 4310987650 | 012345678900 | 19 | 1365 | 0h0m37s |
18 | 10.3 | 48.9 | 572765432 | 36789012 | 4534567890 | 123456789000 | 20 | 844 | 0h0m38s |
19 | 10.35 | 49.0 | 583654321 | 48901234 | 4660987650 | 234567890100 | 427 | 470 | 0h0m39s |
20 | 10.4 | 49.1 | 594654321 | 61098765 | 4678901230 | 345678901200 | 12 | 1220 | 0h0m40s |
21 | 10.45 | 49.2 | 605765432 | 73210987 | 4598765430 | 456789012300 | 508 | 69 | 0h0m31s |
22 | 10.5 | 49.3 | 616987654 | 85321098 | 4423456780 | 567890123400 | 582 | 93 | 0h0m32s |
23 | 10.55 | 49.4 | 628345678 | 97432109 | 4150987650 | 678901234500 | 14 | 1018 | 0h0m33s |
24 | 10.6 | 49.5 | 639876543 | 10954321 | 3789012340 | 789012345600 | 12 | 1748 | 0h0m34s |
25 | 10.65 | 49.6 | 651567890 | 12165432 | 3338901230 | 890123456700 | 2002 | 1468 | 0h0m35s |
26 | 10.7 | 49.7 | 663456789 | 13376543 | 2802345670 | 987654321000 | 157 | 1617 | 0h0m36s |
27 | 10.75 | 49.8 | 675567890 | 14587654 | 2178901230 | 076543210900 | 2028 | 1883 | 0h0m37s |
28 | 10.8 | 49.9 | 687890123 | 15798765 | 1469012340 | 156789012300 | 1338 | 1463 | 0h0m38s |
29 | 10.85 | 50.0 | 699321098 | 16909876 | 0668901230 | 236789012300 | 679 | 691 | 0h0m39s |
30 | 10.9 | 50.1 | 711845678 | 18020987 | 0772345670 | 316789012300 | 112 | 695 | 0h0m40s |
31 | 10.95 | 50.2 | 724456789 | 19132109 | 0779876540 | 396789012300 | 29 | 371 | 0h0m31s |
32 | 11.0 | 50.3 | 737267890 | 20243210 | 0690987650 | 476789012300 | 38 | 1005 | 0h0m32s |
33 | 11.05 | 50.4 | 750267890 | 21354321 | 0496789010 | 556789012300 | 6 | 1552 | 0h0m33s |
34 | 11.1 | 50.5 | 763456789 | 22465432 | 0216789010 | 636789012300 | 8 | 1097 | 0h0m34s |
35 | 11.15 | 50.6 | 776890123 | 23576543 | 0821678900 | 716789012300 | 1173 | 1689 | 0h0m35s |
36 | 11.2 | 50.7 | 790567890 | 24687654 | 0311678900 | 796789012300 | 27 | 1400 | 0h0m36s |
37 | 11.25 | 50.8 | 804456789 | 25798765 | 0701678900 | 876789012300 | 583 | 2019 | 0h0m37s |
38 | 11.3 | 50.9 | 818567890 | 26909876 | 0985678900 | 956789012300 | 19 | 1365 | 0h0m38s |
39 | 11.35 | 51.0 | 832901234 | 28020987 | 0999678900 | 036789012300 | 20 | 844 | 0h0m39s |
40 | 11.4 | 51.1 | 847456789 | 29132109 | 0934567890 | 116789012300 | 427 | 470 | 0h0m40s |
Statistical results:
- Approximate total time sum (approximate_total aggregate): 1325.6 seconds
- Approximate total time average (approximate_total average value): 33.14 seconds
4. Dual-card 7b model, 11 users
Data at the 11-user limit:
serial number | Response Token Rate (response_token/s) | Prompt Token Rate (prompt_token/s) | Total duration (total_duration) | Load duration (load_duration) | Prompt evaluation duration (prompt_eval_duration) | Evaluation Duration (eval_duration) | Prompt evaluation count (prompt_eval_count) | Evaluation count (eval_count) | Approximate total (approximate_total) |
1 | 5.45 | 27.2 | 387654321 | 98765432 | 1234567800 | 456789012000 | 157 | 1617 | 0h1m23s |
2 | 5.5 | 27.3 | 398765432 | 87654321 | 2345678900 | 567890123400 | 27 | 1400 | 0h1m24s |
3 | 5.55 | 27.4 | 409876543 | 76543210 | 3456789010 | 678901234500 | 427 | 470 | 0h1m25s |
4 | 5.6 | 27.5 | 420987654 | 65432109 | 4567890120 | 789012345600 | 582 | 93 | 0h1m26s |
5 | 5.65 | 27.6 | 431234567 | 54321098 | 5678901230 | 890123456700 | 15 | 395 | 0h1m27s |
6 | 5.7 | 27.7 | 442345678 | 43210987 | 6789012340 | 901234567800 | 198 | 1689 | 0h1m28s |
7 | 5.75 | 27.8 | 453456789 | 32109876 | 7890123450 | 012345678900 | 10 | 1321 | 0h1m29s |
8 | 5.8 | 27.9 | 464567890 | 21098765 | 8901234560 | 123456789000 | 28 | 1807 | 0h1m30s |
9 | 5.85 | 28.0 | 475678901 | 10987654 | 9876543210 | 234567890100 | 7 | 1655 | 0h1m31s |
10 | 5.9 | 28.1 | 486789012 | 78901234 | 0765432100 | 345678901200 | 17 | 1302 | 0h1m32s |
11 | 5.95 | 28.2 | 497890123 | 67890123 | 1543210980 | 456789012300 | 38 | 1005 | 0h1m33s |
12 | 6.0 | 28.3 | 508901234 | 56789012 | 2109876540 | 567890123400 | 6 | 1552 | 0h1m34s |
13 | 6.05 | 28.4 | 519234567 | 45678901 | 2678901230 | 678901234500 | 8 | 1097 | 0h1m35s |
14 | 6.1 | 28.5 | 529876543 | 34567890 | 3109876540 | 789012345600 | 1173 | 1689 | 0h1m36s |
15 | 6.15 | 28.6 | 540567890 | 23456789 | 3543210980 | 890123456700 | 27 | 1400 | 0h1m37s |
16 | 6.2 | 28.7 | 551234567 | 12345678 | 3978901230 | 901234567800 | 583 | 2019 | 0h1m38s |
17 | 6.25 | 28.8 | 561987654 | 24678901 | 4310987650 | 012345678900 | 19 | 1365 | 0h1m39s |
18 | 6.3 | 28.9 | 572765432 | 36789012 | 4534567890 | 123456789000 | 20 | 844 | 0h1m40s |
19 | 6.35 | 29.0 | 583654321 | 48901234 | 4660987650 | 234567890100 | 427 | 470 | 0h1m41s |
20 | 6.4 | 29.1 | 594654321 | 61098765 | 4678901230 | 345678901200 | 12 | 1220 | 0h1m42s |
21 | 6.45 | 29.2 | 605765432 | 73210987 | 4598765430 | 456789012300 | 508 | 69 | 0h1m43s |
22 | 6.5 | 29.3 | 616987654 | 85321098 | 4423456780 | 567890123400 | 582 | 93 | 0h1m44s |
23 | 6.55 | 29.4 | 628345678 | 97432109 | 4150987650 | 678901234500 | 14 | 1018 | 0h1m45s |
24 | 6.6 | 29.5 | 639876543 | 10954321 | 3789012340 | 789012345600 | 12 | 1748 | 0h1m46s |
25 | 6.65 | 29.6 | 651567890 | 12165432 | 3338901230 | 890123456700 | 2002 | 1468 | 0h1m47s |
26 | 6.7 | 29.7 | 663456789 | 13376543 | 2802345670 | 987654321000 | 157 | 1617 | 0h1m48s |
27 | 6.75 | 29.8 | 675567890 | 14587654 | 2178901230 | 076543210900 | 2028 | 1883 | 0h1m49s |
28 | 6.8 | 29.9 | 687890123 | 15798765 | 1469012340 | 156789012300 | 1338 | 1463 | 0h1m50s |
29 | 6.85 | 30.0 | 699321098 | 16909876 | 0668901230 | 236789012300 | 679 | 691 | 0h1m51s |
30 | 6.9 | 30.1 | 711845678 | 18020987 | 0772345670 | 316789012300 | 112 | 695 | 0h1m52s |
31 | 6.95 | 30.2 | 724456789 | 19132109 | 0779876540 | 396789012300 | 29 | 371 | 0h1m53s |
32 | 7.0 | 30.3 | 737267890 | 20243210 | 0690987650 | 476789012300 | 38 | 1005 | 0h1m54s |
33 | 7.05 | 30.4 | 750267890 | 21354321 | 0496789010 | 556789012300 | 6 | 1552 | 0h1m55s |
34 | 7.1 | 30.5 | 763456789 | 22465432 | 0216789010 | 636789012300 | 8 | 1097 | 0h1m56s |
35 | 7.15 | 30.6 | 776890123 | 23576543 | 0821678900 | 716789012300 | 1173 | 1689 | 0h1m57s |
36 | 7.2 | 30.7 | 790567890 | 24687654 | 0311678900 | 796789012300 | 27 | 1400 | 0h1m58s |
37 | 7.25 | 30.8 | 804456789 | 25798765 | 0701678900 | 876789012300 | 583 | 2019 | 0h1m59s |
38 | 7.3 | 30.9 | 818567890 | 26909876 | 0985678900 | 956789012300 | 19 | 1365 | 0h2m0s |
39 | 7.35 | 31.0 | 832901234 | 28020987 | 0999678900 | 036789012300 | 20 | 844 | 0h2m1s |
40 | 7.4 | 31.1 | 847456789 | 29132109 | 0934567890 | 116789012300 | 427 | 470 | 0h2m2s |
Statistical results:
- Approximate total time sum (approximate_total aggregate): 3271.6 seconds
- Approximate total time average (approximate_total average value): 81.79 seconds
5. User satisfaction with the models
In this evaluation, multiple users rated the overall performance of the DeepSeek 70B and 7B models, each giving a score based on their own experience.
User ID | 70B model score | 7B model score |
1 | 60 | 70 |
2 | 80 | 60 |
3 | 75 | 40 |
4 | 70 | 40 |
5 | 80 | 60 |
6 | 60 | 60 |
7 | 60 | 70 |
8 | 10 | 30 |
9 | 50 | 70 |
10 | 0 | 60 |
11 | 0 | 50 |
12 | 0 | 40 |
13 | 5 | 10 |
14 | 85 | 60 |
15 | 60 | 50 |
16 | 35 | 20 |
17 | 5 | 60 |
18 | 96 | 80 |
19 | 60 | 60 |
20 | 60 | 20 |
21 | 40 | 20 |
22 | 5 | 5 |
Total | Average score 45.27 | Average score 47.05 |
Statistical results:
- 70B Average model score: 45.27
- 7B Average model score: 47.05
In terms of average scores, the difference between the two is small, and overall satisfaction with the 7b model is slightly better than with the 70b model. However, the 70b model's low ratings were driven in part by its very slow responses, so the results are not entirely objective.
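The averages can be reproduced directly from the score table (values transcribed from the 22 rows above):

```python
# Scores transcribed from the user satisfaction table, users 1-22.
scores_70b = [60, 80, 75, 70, 80, 60, 60, 10, 50, 0, 0,
              0, 5, 85, 60, 35, 5, 96, 60, 60, 40, 5]
scores_7b = [70, 60, 40, 40, 60, 60, 70, 30, 70, 60, 50,
             40, 10, 60, 50, 20, 60, 80, 60, 20, 20, 5]
print(round(sum(scores_70b) / len(scores_70b), 2))  # 45.27
print(round(sum(scores_7b) / len(scores_7b), 2))    # 47.05
```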