UpperbodySegmentationSDK is a high-performance human silhouette extraction and masking library built on OpenCV and DeepCore (Deepixel's proprietary library). It leverages TensorFlow Lite models for real-time human detection, precise foreground-background separation, and high-quality segmentation masking.
It outputs a segmentation mask corresponding to the detected human region, providing pixel-level accuracy. This enables you to seamlessly separate the human upper body from the background for various applications.
The SDK outputs a single-channel 8-bit grayscale image of the same resolution as the input. Pixel values range from 0 to 255, where 255 indicates the foreground and 0 indicates the background. Intermediate values may appear near foreground-background boundaries to create a soft transition and reduce aliasing artifacts.
Below is an illustration showing the segmentation coverage:
- Real-time Processing: Fast and accurate upperbody (portrait) segmentation.
- Robust Algorithms: Highly resilient against complex backgrounds, motion blurs, and lighting changes.
- Multi-person Support: Supports simultaneous segmentation of multiple people in a single frame.
- Ultra-lightweight: Works on CPU only — no GPU required.
- Cross-Platform: Supported on iOS, Android, Windows, macOS, and Web.
- Versatile Use Cases: Ideal for virtual backgrounds (video conferencing), portrait mode (background blur), AR try-on, live streaming, and photo editing.
Note: GPU usage is 0% (CPU-only inference).
| Phone Model | Year | CPU Name | Tier | Inference Time (ms) | CPU Usage (%) |
|---|---|---|---|---|---|
| Samsung Galaxy S9 | 2018 | Exynos 9810 | Android-Low | 7-8 | 2.75 |
| Samsung Galaxy A52s | 2021 | Snapdragon 778G | Android-Mid | 8.15 | 3.04 |
| Samsung Galaxy Fold3 | 2021 | Snapdragon 888 | Android-High | 4.99 | 1.86 |
| Samsung Galaxy S22 | 2022 | Snapdragon 8 Gen 1 | Android-High | 4.25 | 1.58 |
| iPhone 6s | 2015 | A9 | iOS-Low | 12-13 | 19.5-20.5 |
| iPhone XS | 2018 | A12 Bionic | iOS-Mid | 5.55 | 3-4 |
| iPhone 14 Pro | 2022 | A16 Bionic | iOS-High | 3.09 | 1.8 |
- Inference Time reflects only the internal model execution time.
| Dataset | Mean IoU | Mean Boundary F1 | Mean Boundary IoU |
|---|---|---|---|
| PP-HumanSeg14K validation dataset (2,431 images) | 85.83% | 86.88% | 76.27% |
| EasyPortrait test dataset (4,000 images) | 97.28% | 95.13% | 83.24% |
| EG1800 dataset (1,736 images) | 96.25% | 91.66% | 76.15% |
| Platform | File | Uncompressed | Compressed (gzip) |
|---|---|---|---|
| Android arm64-v8a | libhumanx_native_android.so | 7.38 MB | 3.16 MB |
| Android armeabi-v7a | libhumanx_native_android.so | 4.60 MB | 2.50 MB |
| iOS arm64 | libhumanx_native_ios.dylib | 11.63 MB | 4.60 MB |
| macOS arm64 | libhumanx_native_mac.dylib | 14.64 MB | 5.29 MB |
| WASM (js + wasm) | humanx_native_wasm.* | 6.69 MB | 2.11 MB |
Created Date: 2026-04-14
- Main namespaces (C++): `xyz::deepixel` and `xyz::deepixel::coreai`
- Main module (NAVER): UpperbodySegmentationSDK
- `humanx/humanx_naver.hpp`: C++ umbrella header
- `humanx/humanx_naver_c.h`: C umbrella header
- `humanx/humanx_coreai_naver.hpp`: C++ CoreAI umbrella header
- `humanx/humanx_coreai_naver_c.h`: C CoreAI umbrella header
- `humanx/coreai/coreai_upper_body_segmentation_naver.hpp`
- `humanx/coreai/coreai_upper_body_segmentation_naver_c.h`
- `humanx/image_type.hpp` / `humanx/image_type_c.h`
- `humanx/image_buffer.hpp` / `humanx/image_buffer_c.h`
- `humanx/frame_input.hpp` / `humanx/frame_input_c.h`
- `humanx/upper_body_segmentation_types.hpp` / `humanx/upper_body_segmentation_types_c.h`
- `humanx/humanx_export.h`
Supported input image formats.
Description: Describes pixel layout of input frames used by processing APIs.
Values:
- `BGR_888` - 3-channel BGR, 8 bits per channel. Preferred internal format for algorithms.
- `RGB_888` - 3-channel RGB, 8 bits per channel.
- `YUV_420_888` - YUV 4:2:0 semi-planar format (NV21-compatible conversion path).
- `RGBA_8888` - 4-channel RGBA, 8 bits per channel.
- `BGRA_8888` - 4-channel BGRA, 8 bits per channel. Recommended for iOS readback paths.
Platform Recommendations:
- Android: `YUV_420_888` or `RGBA_8888`
- iOS: `BGRA_8888`
Performance Note: `BGR_888` is the fastest path because algorithms use BGR natively and typically avoid an additional color-conversion step.
Lightweight image buffer container with optional ownership.
Description: This type stores image shape/stride metadata and a pointer to pixel bytes. It supports two memory modes:
- View mode: references external memory (`ownsData=false`)
- Owned mode: stores bytes in `ownedData` (`ownsData=true`)
Fields:
- `int32_t width` - Image width in pixels.
- `int32_t height` - Image height in pixels.
- `int32_t step` - Number of bytes per row.
- `int32_t channels` - Number of channels per pixel.
- `const uint8_t* data` - Pointer to first pixel byte (may point to external or owned memory).
- `size_t dataSize` - Total readable bytes at `data`.
- `bool ownsData` - True if this instance owns pixel bytes via `ownedData`.
- `std::vector<uint8_t> ownedData` - Owned storage used when `ownsData=true`.
Methods:
- `void setOwnedCopy(const uint8_t* srcData, size_t srcSize)` - Copy external bytes into owned storage.
- `void setOwnedMove(std::vector<uint8_t>&& srcData)` - Move a byte vector into owned storage without copying.
- `void setView(const uint8_t* srcData, size_t srcSize)` - Reference external memory without taking ownership.
- `bool empty() const` - Check whether the buffer is effectively empty or invalid.
- `void clear()` - Reset metadata and release owned storage.
Helper Conversions:
- `ImageBuffer fromCImageBuffer(const DpxlImageBuffer& src, bool copyData = false)` - Convert C to C++.
- `DpxlImageBuffer toCImageBuffer(const ImageBuffer& src)` - Convert C++ to C.
Input frame descriptor for image processing APIs.
Description: This structure describes a single image frame and its metadata. The image buffer is supplied by the caller through `frame`.
Buffer Interpretation:
- `frame.data` points to the first byte of the image buffer.
- `frame.dataSize` is the total readable byte size.
- `frame.width` and `frame.height` are in pixels.
- `frame.stride` is the byte distance between adjacent rows.
- `imageType` defines pixel format and channel layout.
Fields:
- `unsigned int timestamp` - Frame timestamp propagated by the caller (unit defined by caller).
- `ImageBuffer frame` - Input frame image buffer and memory layout information.
- `DP_IMAGE_TYPE imageType` - Pixel format of `frame`. Default: `BGRA_8888` (recommended for GL readback paths).
- `bool isStill` - True for still image input, false for streaming/video frame input.
UpperbodySegmentationSDK inference result.
Description: Contains processing status and segmentation output for a single frame. The output mask represents the detected human region with pixel-level foreground confidence.
Fields:
- `unsigned int timestamp` - Frame timestamp propagated from input context.
- `bool success` - True when segmentation succeeded and `segmentationMask` is valid.
- `ImageBuffer segmentationMask` - Pixel-level UpperbodySegmentationSDK mask image.
Segmentation Mask Characteristics:
- Single-channel 8-bit grayscale image (or equivalent binary matrix)
- Same resolution as input image
- Foreground confidence per pixel:
- 255: confirmed foreground (human region)
- 1–254: soft edge / transition region
- 0: background
High-performance UpperbodySegmentationSDK engine.
Description: This class performs real-time foreground/background separation. It generates a segmentation mask for the detected human region, enabling clean upper-body extraction from the background.
Output Mask Characteristics:
- Single-channel 8-bit grayscale image
- Same resolution as input image
- Each pixel ranges from 0–255, representing foreground probability.
- Pixel-level foreground confidence:
  - 255: confirmed foreground (human region)
  - 1–254: soft edge / transition region
  - 0: background
- Users may threshold (e.g., > 128) to obtain a binary mask.
Lifecycle:
- `DpxlCoreAIUpperBodySegmentationNaver* dpxl_coreai_upperbodyseg_naver_create(void)` - Create and initialize a CoreAIUpperBodySegmentationNaver instance.
  - Returns: Opaque pointer to the CoreAIUpperBodySegmentationNaver instance.
- `void dpxl_coreai_upperbodyseg_naver_destroy(DpxlCoreAIUpperBodySegmentationNaver* handle)` - Destroy a CoreAIUpperBodySegmentationNaver instance.
  - Parameters:
    - `handle` - Pointer to the CoreAIUpperBodySegmentationNaver instance.
Initialization:
- `bool dpxl_coreai_upperbodyseg_naver_initialize(DpxlCoreAIUpperBodySegmentationNaver* handle)` - Initialize UpperbodySegmentationSDK with the license key.
  - Parameters:
    - `handle` - Pointer to the CoreAIUpperBodySegmentationNaver instance.
  - Returns: `true` if initialization succeeds, `false` otherwise.
- `bool dpxl_coreai_upperbodyseg_naver_is_initialized(DpxlCoreAIUpperBodySegmentationNaver* handle)` - Check whether the algorithm is initialized.
  - Parameters:
    - `handle` - Pointer to the CoreAIUpperBodySegmentationNaver instance.
  - Returns: `true` if initialized, `false` otherwise.
Inference:
- `bool dpxl_coreai_upperbodyseg_naver_process(DpxlCoreAIUpperBodySegmentationNaver* handle, const DpxlFrameInput* data, double blurSigma, DpxlCoreAIUpperBodySegmentationOutput* output)` - Process an image for UpperbodySegmentationSDK.
  - Parameters:
    - `handle` - Pointer to the CoreAIUpperBodySegmentationNaver instance.
    - `data` - Processing data containing image information.
    - `blurSigma` - Gaussian blur sigma for segmentation post-processing (default: 0.5).
      - Lower values → sharper edges, potentially more noise
      - Higher values → smoother results, may reduce fine details
    - `output` - Output structure to store the result.
  - Returns: `true` if processing/result generation succeeds, `false` otherwise.
C equivalent of DP_IMAGE_TYPE.
Description: Supported input image formats. Describes pixel layout of input frames used by processing APIs.
Values:
- `DPXL_IMAGE_TYPE_BGR_888` - 3-channel BGR, 8 bits per channel. Preferred internal format for algorithms.
- `DPXL_IMAGE_TYPE_RGB_888` - 3-channel RGB, 8 bits per channel.
- `DPXL_IMAGE_TYPE_YUV_420_888` - YUV 4:2:0 semi-planar format (NV21-compatible conversion path).
- `DPXL_IMAGE_TYPE_RGBA_8888` - 4-channel RGBA, 8 bits per channel.
- `DPXL_IMAGE_TYPE_BGRA_8888` - 4-channel BGRA, 8 bits per channel. Recommended for iOS readback paths.
Platform Recommendations:
- Android: `DPXL_IMAGE_TYPE_YUV_420_888` or `DPXL_IMAGE_TYPE_RGBA_8888`
- iOS: `DPXL_IMAGE_TYPE_BGRA_8888`
Performance Note: `DPXL_IMAGE_TYPE_BGR_888` is the fastest path because algorithms use BGR natively and typically avoid an additional color-conversion step.
C image buffer descriptor with optional ownership.
Description: Stores image shape/stride metadata and a pointer to pixel bytes.
Fields:
- `int32_t width` - Image width in pixels.
- `int32_t height` - Image height in pixels.
- `int32_t step` - Number of bytes per row.
- `int32_t channels` - Number of channels per pixel.
- `const uint8_t* data` - Pointer to first pixel byte.
- `size_t dataSize` - Total readable bytes.
- `int32_t ownsData` - Ownership flag:
  - 0: non-owning view (external memory)
  - 1: owning semantics (owned memory)
C input frame descriptor for image processing APIs.
Description: Describes a single image frame and its metadata in C format.
Fields:
- `uint32_t timestamp` - Frame timestamp propagated by the caller.
- `const void* data` - Pointer to image buffer.
- `size_t dataSize` - Total readable bytes.
- `int32_t width` - Image width in pixels.
- `int32_t height` - Image height in pixels.
- `int32_t stride` - Number of bytes per row.
- `DpxlImageType imageType` - Pixel format of the image.
- `bool isStill` - True for still image input, false for streaming/video frame input.
C UpperbodySegmentationSDK inference result.
Description: Contains processing status and segmentation output for a single frame.
Fields:
- `uint32_t timestamp` - Frame timestamp propagated from input context.
- `bool success` - True when segmentation succeeded and `segmentationMask` is valid.
- `DpxlImageBuffer segmentationMask` - Pixel-level UpperbodySegmentationSDK mask image.
Segmentation Mask Characteristics:
- Single-channel 8-bit grayscale image (or equivalent binary matrix)
- Same resolution as input image
- Foreground confidence per pixel:
  - 255: confirmed foreground (human region)
  - 1–254: soft edge / transition region
  - 0: background
`typedef struct DpxlCoreAIUpperBodySegmentationNaver DpxlCoreAIUpperBodySegmentationNaver;`
Description: Opaque handle for the C++ CoreAIUpperBodySegmentationNaver instance. This handle refers to a high-performance UpperbodySegmentationSDK engine that performs real-time foreground/background separation and generates a segmentation mask for the detected human region, enabling clean upper-body extraction from the background.
Output Mask Characteristics:
- Single-channel 8-bit grayscale image
- Same resolution as input image
- Each pixel ranges from 0–255, representing foreground probability.
- Pixel-level foreground confidence:
  - 255: confirmed foreground (human region)
  - 1–254: soft edge / transition region
  - 0: background
- Users may threshold (e.g., > 128) to obtain a binary mask.
Creation and Destruction:
- `DpxlCoreAIUpperBodySegmentationNaver* dpxl_coreai_upperbodyseg_naver_create(void)` - Create and initialize a CoreAIUpperBodySegmentationNaver instance.
  - Returns: Opaque pointer to the CoreAIUpperBodySegmentationNaver instance.
- `void dpxl_coreai_upperbodyseg_naver_destroy(DpxlCoreAIUpperBodySegmentationNaver* handle)` - Destroy a CoreAIUpperBodySegmentationNaver instance.
  - Parameters:
    - `handle` - Pointer to the CoreAIUpperBodySegmentationNaver instance.
Initialization:
- `bool dpxl_coreai_upperbodyseg_naver_initialize(DpxlCoreAIUpperBodySegmentationNaver* handle)` - Initialize UpperbodySegmentationSDK with the license key.
  - Parameters:
    - `handle` - Pointer to the CoreAIUpperBodySegmentationNaver instance.
  - Returns: `true` if initialization succeeds, `false` otherwise.
- `bool dpxl_coreai_upperbodyseg_naver_is_initialized(DpxlCoreAIUpperBodySegmentationNaver* handle)` - Check whether the algorithm is initialized.
  - Parameters:
    - `handle` - Pointer to the CoreAIUpperBodySegmentationNaver instance.
  - Returns: `true` if initialized, `false` otherwise.
Inference:
- `bool dpxl_coreai_upperbodyseg_naver_process(DpxlCoreAIUpperBodySegmentationNaver* handle, const DpxlFrameInput* data, double blurSigma, DpxlCoreAIUpperBodySegmentationOutput* output)` - Process an image for UpperbodySegmentationSDK.
  - Parameters:
    - `handle` - Pointer to the CoreAIUpperBodySegmentationNaver instance.
    - `data` - Processing data containing image information.
    - `blurSigma` - Gaussian blur sigma for segmentation post-processing (default: 0.5).
      - Lower values → sharper edges, potentially more noise
      - Higher values → smoother results, may reduce fine details
    - `output` - Output structure to store the result.
  - Returns: `true` if processing/result generation succeeds, `false` otherwise.
```cpp
#include <humanx/humanx_naver.hpp>

using namespace xyz::deepixel;
using namespace xyz::deepixel::coreai;

CoreAIUpperBodySegmentationNaver engine;
if (engine.initialize()) {
    FrameInput in;
    // fill in.frame / in.imageType / in.timestamp / in.isStill
    auto out = engine.process(in, 0.5);
}
```

```c
#include <humanx/humanx_naver_c.h>

DpxlCoreAIUpperBodySegmentationNaver* h = dpxl_coreai_upperbodyseg_naver_create();
if (h && dpxl_coreai_upperbodyseg_naver_initialize(h)) {
    DpxlFrameInput in = {0};
    DpxlCoreAIUpperBodySegmentationOutput out = {0};
    dpxl_coreai_upperbodyseg_naver_process(h, &in, 0.5, &out);
}
dpxl_coreai_upperbodyseg_naver_destroy(h);
```

- `HumanXNative(options?) -> Promise<Module>` - Emscripten module factory exported by `humanx_native_wasm.js`.
  - Typical usage: `const wasmModule = await HumanXNative({ locateFile: (path) => path });`
Common runtime exports:
- `_malloc(size)` - Allocate bytes in the WASM heap.
- `_free(ptr)` - Free a previously allocated heap pointer.
- `HEAPU8` - Uint8 view for raw byte copies (`HEAPU8.set(...)`).
In this repository, the recommended JS surface is the wrapper in `samples-naver/wasm-web-sample/humanx-wrapper.js`:
- `class CoreAIUpperBodySegmentation`
  - `constructor(wasmModule)`
  - `async initialize(): Promise<boolean>`
  - `async isInitialized(): Promise<boolean>`
  - `isProcessing(): boolean`
  - `async process(timestamp, dataPtr, dataSize, width, height, stride, imageType, isStill, blurSigma = 0.5): Promise<ProcessResult>`
  - `destroy(): void`
ProcessResult shape:
- `success: boolean`
- `timestamp: number`
- `segmentationMaskWidth: number`
- `segmentationMaskHeight: number`
- `segmentationMaskStep: number`
- `segmentationMask: number` (raw pointer into the WASM heap)
`CoreAIUpperBodySegmentation.process(...)` parameters map to C/C++ API concepts as follows:
- `timestamp` -> `DpxlFrameInput.timestamp`
- `dataPtr` + `dataSize` -> `DpxlFrameInput.data` + `DpxlFrameInput.dataSize`
- `width` / `height` / `stride` -> `DpxlFrameInput.width` / `height` / `stride`
- `imageType` -> `DpxlFrameInput.imageType`
- `isStill` -> `DpxlFrameInput.isStill`
- `blurSigma` -> `dpxl_coreai_upperbodyseg_naver_process(..., blurSigma, ...)`
- The browser sample uses RGBA input from Canvas `ImageData`.
- Example: `wasmModule.ImageType.RGBA_8888` (converted to an integer enum value before passing to `process`).
```js
const wasmModule = await HumanXNative({ locateFile: (path) => path });
const seg = new CoreAIUpperBodySegmentation(wasmModule);
const ok = await seg.initialize();
if (!ok) throw new Error('initialize failed');

const imageSize = width * height * 4; // RGBA
const imagePtr = wasmModule._malloc(imageSize);
wasmModule.HEAPU8.set(rgbaBytes, imagePtr);

const imageType = Number(wasmModule.ImageType.RGBA_8888);
const result = await seg.process(
  Date.now() >>> 0,
  imagePtr,
  imageSize,
  width,
  height,
  width * 4,
  imageType,
  true,
  0.5
);

wasmModule._free(imagePtr);
seg.destroy();
```

- Always free input buffers allocated with `_malloc`.
- `result.segmentationMask` is a raw pointer into WASM memory, not a copied JS array.
- If you need persistent mask data, copy the bytes out immediately (for example into a `Uint8Array`) before subsequent native calls or destroy operations.
Wrapper internally uses embind classes exposed by WASM runtime, including:
- `FrameInput`
- `ImageBuffer`
- `CoreAIUpperBodySegmentationNaver`
Advanced users can call these directly, but wrapper usage is preferred for predictable object lifecycle and normalized result conversion.

