Machine Learning Inference Performance

AIMark 3

AIMark uses various vendor SDKs to implement its benchmarks. This means that the end results really aren’t a proper apples-to-apples comparison; however, it represents an approach that some vendors will actually use in their in-house applications, and that may even show up in the rare third-party app.

[Charts: 鲁大师 / Master Lu - AIMark 3: InceptionV3, ResNet34, MobileNet-SSD, DeepLabV3]

Unfortunately for the Black Shark 2, the device lacked the drivers needed to run AIMark, and the benchmark repeatedly crashed on startup. We had the same issue on the OnePlus 7 Pro, which points to some software incompatibility.

AIBenchmark 3

AIBenchmark takes a different approach to benchmarking. Here the test uses the hardware-agnostic NNAPI to accelerate inferencing, meaning it doesn’t use any proprietary aspects of a given piece of hardware except for the drivers that actually enable the abstraction between software and hardware. This approach is more apples-to-apples, but it also means that we can’t do cross-platform comparisons, such as testing iPhones.

We’re publishing one-shot inference times. These differ from sustained-performance inference times in that the figures include more timing overhead from the software stack, from initialising the test to actually executing the computation.
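To illustrate the distinction, here is a minimal Python sketch of how a one-shot time and a sustained time might be measured. It is not how AIBenchmark itself takes its measurements; the names `one_shot_and_sustained`, `make_model` and `run_inference` are hypothetical, and the "model" in the demo is just a stand-in that sleeps.

```python
import time
from statistics import mean


def one_shot_and_sustained(make_model, run_inference, warm_runs=10,
                           clock=time.perf_counter):
    """Return (one_shot_seconds, sustained_seconds).

    make_model and run_inference are hypothetical callables supplied by
    the caller; they stand in for whatever the real software stack does.
    """
    # One-shot: the timed region covers initialisation plus the first run,
    # so setup overhead is part of the reported figure.
    start = clock()
    model = make_model()
    run_inference(model)
    one_shot = clock() - start

    # Sustained: the model is already initialised, so only the repeated
    # executions are timed and then averaged.
    samples = []
    for _ in range(warm_runs):
        start = clock()
        run_inference(model)
        samples.append(clock() - start)

    return one_shot, mean(samples)


if __name__ == "__main__":
    # Stand-in workload (an assumption for the demo): slow setup, fast runs.
    def make_model():
        time.sleep(0.05)
        return object()

    def run_inference(model):
        time.sleep(0.005)

    one_shot, sustained = one_shot_and_sustained(make_model, run_inference)
    print(f"one-shot:  {one_shot * 1000:.1f} ms")
    print(f"sustained: {sustained * 1000:.1f} ms")
```

With any workload whose setup cost is not negligible, the one-shot figure comes out larger than the sustained one, which is the overhead the paragraph above refers to.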

AIBenchmark 3 - NNAPI CPU

We’re segregating the AIBenchmark scores by execution block, starting off with the regular CPU workloads that simply use TensorFlow libraries and do not attempt to run on specialized hardware blocks.

[Charts: AIBenchmark 3 CPU results: 1 - The Life (CPU/FP); 2 - Zoo (CPU/FP); 3 - Pioneers (CPU/INT); 4 - Let's Play (CPU/FP); 7 - Ms. Universe (CPU/FP); 7 - Ms. Universe (CPU/INT); 8 - Blur iT! (CPU/FP)]

In AI Benchmark’s CPU workloads, the Black Shark 2 ends up with a bit of an odd spread of scores. In the shorter-running benchmarks the phone gets relatively average inference times, while in the longer-running tests the BS2 for some reason falls behind other S855 devices. In fact, it looks like the BS2 lands among the worse-off S855 devices in the later tests in the list.

AIBenchmark 3 - NNAPI INT8

[Charts: AIBenchmark 3 INT8 results: 1 - The Life; 2 - Zoo; 3 - Pioneers; 5 - Masterpiece; 6 - Cartoons]

AIBenchmark 3 - NNAPI FP16

[Charts: AIBenchmark 3 FP16 results: 1 - The Life; 2 - Zoo; 3 - Pioneers; 5 - Masterpiece; 6 - Cartoons; 9 - Berlin Driving; 10 - WESPE-dn]

AIBenchmark 3 - NNAPI FP32

[Chart: AIBenchmark 3 FP32 results: 10 - WESPE-dn]

In the INT8, FP16 and FP32 accelerated tests, which make use of acceleration blocks such as the Hexagon DSP and the GPU, the Black Shark 2 performs very well and in line with other Snapdragon 855 devices.

Overall, the Black Shark 2 is a good performer in the machine learning inferencing benchmarks, but like other devices, it’s not quite the very best in every regard. This suggests that the vendor could have improved its performance by keeping the software stack more up to date with what Qualcomm is offering, a widespread issue that I expect to persist over the next few years as the ecosystem quickly evolves.


63 Comments


  • Flunk - Wednesday, September 25, 2019 - link

    I'd take "gaming" phones more seriously if they actually had better hardware instead of just having obnoxious styling like this thing. I feel like that won't happen because of the way SoCs are developed, so we'll just see more of these cynical products.
  • wrkingclass_hero - Thursday, September 26, 2019 - link

    So it has finally come to paid reviews... this will not end well
  • Andrei Frumusanu - Thursday, September 26, 2019 - link

    It will end come Monday.
  • Galcobar - Thursday, September 26, 2019 - link

    Xiaomi might agree with your sentiment, since Qualcomm paid for a review which called out deceptive practices, poor design, and significant under-performance by a Qualcomm client. Clearly, the payment did not include guarantees of positive coverage or control over the published results. Xiaomi is probably wishing this review hadn't happened, but it does seem to establish the independence of AnandTech's editorial staff to publish a negative review even when sponsored.
  • PeachNCream - Thursday, September 26, 2019 - link

    Pretty much this stuff. It's really hard to question AT's integrity with this particular paid review given the results do not paint the phone in question in a very good light. Will that always be the case? Dunno, but I think probably, yes it will.
  • Imran-Shaikh - Thursday, September 26, 2019 - link

    What benefits did AT get from these paid reviews?
    Money or anything else?
    Thanks in advance.
  • Badelhas - Thursday, September 26, 2019 - link

    What do you mean?
  • Average James - Friday, September 27, 2019 - link

    I just ran Slingshot Extreme Unlimited on my own Black Shark 2 with thermal throttling explicitly turned off and the maximum speed setting [4], since Gamer Studio lets you tune that option.

    With the GPU overclocked via the Caller hidden menu, I scored 7100-ish in Graphics and 4100-ish in Physics, which seems legit for its clock setting. Meanwhile, the temperature stayed around 35~38C.

    Then I deliberately set the Gamer Studio level to [2], which actively uses the Silver cores to save battery and reduce heat in games that aren't 3D-heavy. The result is similar to the article's.

    So I wonder whether Mr. Andrei might have misunderstood the CPU/GPU governor stuff. On the Auto setting, it seems natural that the software detects what kind of game or app is running and how much 3D/CPU power it requires to get the most favorable results, like modern VGA drivers do nowadays.
  • Average James - Friday, September 27, 2019 - link

    I can understand that, as a reviewer, you're blaming the phone for detecting benchmark software to turn off thermal throttling. Tricking customers is generally an evil thing. BUT this device offers various performance levels through its exclusive Gamer Studio menu and even lets you set the thermal throttling level if you want.

    So what I cannot get from your article is the phone's real performance, because the article doesn't talk about it. It is just complaining about "Poor performance in Auto perf setting if an App is not registered properly as it uses lower performance (seems level 2) level."
  • s.yu - Friday, September 27, 2019 - link

    "if an App is not registered properly"
    If "real" performance mandates that apps "register" properly, how can you ensure that every game is registered properly then, if "registration" doesn't depend on load detection?
