Violin Plot (Julia)

Author

Bizard Team

Modified

2026-04-04

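A violin plot combines the summary statistics of a box plot with a kernel density estimate, so it shows the full shape of a distribution rather than only its quartiles. Julia's CairoMakie package makes violin plots straightforward to create, and they are commonly used to compare gene expression, biomarker levels, or clinical measurements across groups.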
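To make the two ingredients concrete, the sketch below computes a Gaussian kernel density estimate and the quartiles by hand, using only the standard library. The function name `gaussian_kde`, the sample data, and the normal-reference bandwidth rule are assumptions chosen for illustration; CairoMakie computes its own density internally and this is not its implementation.

```julia
using Statistics
using Random

# Gaussian kernel density estimate evaluated at every point of `grid`.
# `h` is the bandwidth; the default is the normal-reference rule of thumb
# (an assumption made for this sketch, not what CairoMakie uses).
function gaussian_kde(samples::AbstractVector{<:Real}, grid::AbstractVector{<:Real};
                      h::Real = 1.06 * std(samples) * length(samples)^(-1/5))
    n = length(samples)
    return [sum(exp(-0.5 * ((x - s) / h)^2) for s in samples) / (n * h * sqrt(2π))
            for x in grid]
end

Random.seed!(1)
samples = randn(200) .* 1.5 .+ 8      # hypothetical expression values
grid = range(minimum(samples) - 3, maximum(samples) + 3; length = 400)
density = gaussian_kde(samples, grid)

# The violin outline is this density mirrored around the group position;
# the box-plot part comes from the quartiles.
q1, med, q3 = quantile(samples, [0.25, 0.5, 0.75])

# Sanity check: a density should integrate to roughly 1 (Riemann sum).
area = sum(density) * step(grid)
```

A wider bandwidth `h` gives a smoother outline, while a narrower one follows individual samples more closely.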

Example

Environment Setup

  • System requirements: cross-platform (Linux/macOS/Windows)
  • Programming language: Julia
  • Dependencies: CairoMakie, DataFrames, Statistics, Random
using CairoMakie
using DataFrames
using Statistics
using Random

Data Preparation

Random.seed!(42)
n_per = 80
groups = vcat(fill("Tumor", n_per), fill("Normal", n_per), fill("Adjacent", n_per))
expression = vcat(
    randn(n_per) .* 1.5 .+ 8,
    randn(n_per) .* 1.2 .+ 5,
    randn(n_per) .* 1.8 .+ 6.5
)
group_idx = vcat(fill(1, n_per), fill(2, n_per), fill(3, n_per))
df = DataFrame(Group=groups, Expression=expression, GroupIdx=group_idx)
240×3 DataFrame (215 rows omitted)

Row  Group     Expression  GroupIdx
     String    Float64     Int64
1    Tumor     7.45496     1
2    Tumor     8.37761     1
3    Tumor     7.52752     1
4    Tumor     7.53312     1
5    Tumor     9.22446     1
6    Tumor     8.71511     1
7    Tumor     6.71067     1
8    Tumor     5.79607     1
9    Tumor     4.8285      1
10   Tumor     8.06567     1
11   Tumor     6.762       1
12   Tumor     9.26043     1
13   Tumor     8.65083     1
⋮    ⋮         ⋮           ⋮
229  Adjacent  8.67822     3
230  Adjacent  5.35136     3
231  Adjacent  6.22416     3
232  Adjacent  3.53355     3
233  Adjacent  8.33217     3
234  Adjacent  6.73736     3
235  Adjacent  11.0713     3
236  Adjacent  6.77344     3
237  Adjacent  7.34419     3
238  Adjacent  5.31451     3
239  Adjacent  5.72756     3
240  Adjacent  3.46341     3
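Before plotting, it can help to check the per-group statistics that the violins are expected to reflect. The following is a minimal sketch that rebuilds the same kind of simulated data and summarizes it with `groupby` and `combine` from DataFrames; the output column names (`N`, `Mean`, `Median`, `SD`) and the variable `summary_df` are arbitrary choices for this example.

```julia
using DataFrames
using Statistics
using Random

Random.seed!(42)
n_per = 80
df = DataFrame(
    Group = repeat(["Tumor", "Normal", "Adjacent"], inner = n_per),
    Expression = vcat(randn(n_per) .* 1.5 .+ 8,
                      randn(n_per) .* 1.2 .+ 5,
                      randn(n_per) .* 1.8 .+ 6.5),
)

# One row per group: sample size, mean, median, and standard deviation.
summary_df = combine(groupby(df, :Group),
    nrow => :N,
    :Expression => mean => :Mean,
    :Expression => median => :Median,
    :Expression => std => :SD,
)
```

With the simulation parameters above, the group means should land near 8, 5, and 6.5, and the Adjacent group should show the largest spread.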

Visualization

Basic Violin Plot

fig = Figure(size=(700, 500))
ax = Axis(fig[1,1], xlabel="Group", ylabel="Expression Level",
          title="Gene Expression Distribution",
          xticks=(1:3, ["Tumor", "Normal", "Adjacent"]))
violin!(ax, df.GroupIdx, df.Expression, color=(:steelblue, 0.7))
fig
Figure 1: Basic violin plot of gene expression
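The basic plot can be adjusted through keyword attributes of `violin!`. The sketch below marks each group's median, narrows the violins, and writes the figure to disk with `save`. The file name `violin_basic.png` and the specific width are assumptions for illustration, and the `size` keyword of `Figure` assumes a recent Makie release.

```julia
using CairoMakie
using Random

Random.seed!(42)
n_per = 80
group_idx = repeat(1:3, inner = n_per)
expression = vcat(randn(n_per) .* 1.5 .+ 8,
                  randn(n_per) .* 1.2 .+ 5,
                  randn(n_per) .* 1.8 .+ 6.5)

fig = Figure(size = (700, 500))
ax = Axis(fig[1, 1], xlabel = "Group", ylabel = "Expression Level",
          xticks = (1:3, ["Tumor", "Normal", "Adjacent"]))
violin!(ax, group_idx, expression;
        show_median = true,        # draw a line at each group's median
        width = 0.8,               # narrower violins leave space between groups
        color = (:steelblue, 0.7))

# Hypothetical output file name; the extension selects the format.
save("violin_basic.png", fig)
```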

Grouped Violin Plot with Box Plot Overlay

fig2 = Figure(size=(700, 500))
ax2 = Axis(fig2[1,1], xlabel="Group", ylabel="Expression Level",
           title="Violin + Box Plot",
           xticks=(1:3, ["Tumor", "Normal", "Adjacent"]))
colors = [:red, :steelblue, :green]
for (i, g) in enumerate(["Tumor", "Normal", "Adjacent"])
    mask = df.Group .== g
    vals = df.Expression[mask]
    # side=:left draws only the left half of each violin; omit `side` for a full, symmetric violin
    violin!(ax2, fill(i, sum(mask)), vals, color=(colors[i], 0.4), side=:left)
    boxplot!(ax2, fill(i, sum(mask)), vals, color=(colors[i], 0.7), width=0.3)
end
fig2
Figure 2: Violin plot with box plot overlay
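For comparison, here is a variant that draws full, symmetric violins with a slim box plot centered inside each one. It is only a sketch: the variable names (`samples`, `palette`), the box width, hiding the outliers, and the output file name are all choices made for this example rather than part of the tutorial's code.

```julia
using CairoMakie
using Random

Random.seed!(42)
n_per = 80
labels = ["Tumor", "Normal", "Adjacent"]
samples = [randn(n_per) .* 1.5 .+ 8,
           randn(n_per) .* 1.2 .+ 5,
           randn(n_per) .* 1.8 .+ 6.5]
palette = [:red, :steelblue, :green]

fig = Figure(size = (700, 500))
ax = Axis(fig[1, 1], xlabel = "Group", ylabel = "Expression Level",
          title = "Violin + Box Plot", xticks = (1:3, labels))
for (i, vals) in enumerate(samples)
    xs = fill(i, length(vals))
    # No `side` keyword: the density is mirrored on both sides.
    violin!(ax, xs, vals; color = (palette[i], 0.4))
    # A narrow box keeps the violin outline visible behind it.
    boxplot!(ax, xs, vals; color = (palette[i], 0.7), width = 0.15,
             show_outliers = false)
end

save("violin_box.png", fig)   # hypothetical output file name
```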

Half-Violin (Raincloud) Plot

fig3 = Figure(size=(800, 500))
ax3 = Axis(fig3[1,1], xlabel="Group", ylabel="Expression Level",
           title="Raincloud Plot",
           xticks=(1:3, ["Tumor", "Normal", "Adjacent"]))
# Reuses the `colors` vector defined in the previous example.
for (i, g) in enumerate(["Tumor", "Normal", "Adjacent"])
    mask = df.Group .== g
    vals = df.Expression[mask]
    violin!(ax3, fill(i, sum(mask)), vals, color=(colors[i], 0.5), side=:right)
    scatter!(ax3, fill(i, sum(mask)) .- 0.15 .+ randn(sum(mask)) .* 0.03,
             vals, color=(colors[i], 0.3), markersize=5)
end
fig3
Figure 3: Raincloud-style half-violin plot
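Makie also ships a `rainclouds` recipe that assembles the half-violin, box plot, and jittered points in one call, which may be a convenient alternative to building the plot by hand. The sketch below assumes a Makie version that provides this recipe with the `clouds` and `plot_boxplots` keywords; the variable names and the output file name are illustrative.

```julia
using CairoMakie
using Random

Random.seed!(42)
n_per = 80
category_labels = repeat(["Tumor", "Normal", "Adjacent"], inner = n_per)
data_array = vcat(randn(n_per) .* 1.5 .+ 8,
                  randn(n_per) .* 1.2 .+ 5,
                  randn(n_per) .* 1.8 .+ 6.5)

# One label per observation; the recipe groups the data by label.
fap = rainclouds(category_labels, data_array;
    axis = (; xlabel = "Group", ylabel = "Expression Level",
            title = "Raincloud Plot"),
    clouds = violin,          # density "cloud" drawn as a half violin
    plot_boxplots = true)     # add a box plot next to each cloud

save("raincloud.png", fap)    # hypothetical output file name
```

The manual loop gives finer control over jitter and colors, while the recipe keeps the call short; which one fits better depends on how much customization the figure needs.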

References

  1. Danisch, S., & Krumbiegel, J. (2021). Makie.jl: Flexible high-performance data visualization for Julia. Journal of Open Source Software, 6(65), 3349.