drivers: video: image statistics, libcamera-style APIs #85457

Open
josuah opened this issue Feb 9, 2025 · 3 comments
Labels
area: API Changes to public APIs area: Drivers area: Video Video subsystem RFC Request For Comments: want input from the community

Comments

Collaborator

josuah commented Feb 9, 2025

Introduction

Cameras constantly adjust colors to make the image look normal in all conditions. To do this, Linux uses libcamera, which Zephyr lacks.

Problem description

There is no API for sending image statistics from the hardware back to the software.

By default, the output of an image sensor is green and dark. Below is the Zephyr kite under a bright spotlight, using the default exposure level:

Image

Some image sensors such as the OV5640 come with built-in image correction so the image looks normal... at the cost of extra noise, so most sensors let an Image Signal Processor (ISP) perform the color corrections.

Image

There is no statistics format defined between the ISP hardware and the Image Processing Algorithm (IPA) software.

Image

Proposed change

As a first step, focus on introducing the API that enables the application to implement the rest: image statistics reporting.

Detailed RFC

Proposed change (Detailed)

Implement a new API for collecting statistics, or reuse one of the existing ones for this purpose.

[ See below for updated version ]

This does not cover advanced features like "region of interest" (ROI). For example, on a phone, tapping on a corner of the display to focus there needs an API to ask the hardware to collect statistics for that smaller region of the display in particular.

Dependencies

Concerns and Unresolved Questions

What API to use?

  • The existing control API? (however, statistics are not )
  • A new dedicated API?

What statistics format to use? libcamera defines sum_red, sum_green, sum_blue, histogram_brightness for the software back-end
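
To make the question concrete, here is a minimal sketch of what computing such statistics in software could look like when no hardware block provides them. The struct name `sw_image_stats`, the bucket count, the packed RGB888 input layout, and the cheap brightness formula are all assumptions made for illustration; only the field names come from the list above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SW_STATS_NUM_BUCKETS 16 /* hypothetical bucket count */

/* Hypothetical container; field names borrowed from the list above. */
struct sw_image_stats {
	uint64_t sum_red;
	uint64_t sum_green;
	uint64_t sum_blue;
	uint32_t histogram_brightness[SW_STATS_NUM_BUCKETS];
};

/* Walk a packed RGB888 buffer (3 bytes per pixel) and accumulate statistics. */
static void sw_image_stats_rgb24(const uint8_t *buf, size_t num_pixels,
				 struct sw_image_stats *stats)
{
	memset(stats, 0, sizeof(*stats));

	for (size_t i = 0; i < num_pixels; i++) {
		uint8_t r = buf[3 * i + 0];
		uint8_t g = buf[3 * i + 1];
		uint8_t b = buf[3 * i + 2];
		/* Cheap brightness estimate (assumption): green counted twice, range 0..255. */
		uint8_t y = (uint8_t)((r + 2 * g + b) / 4);

		stats->sum_red += r;
		stats->sum_green += g;
		stats->sum_blue += b;
		/* Map the 0..255 brightness onto the available buckets. */
		stats->histogram_brightness[y * SW_STATS_NUM_BUCKETS / 256]++;
	}
}
```

Hardware that already reports sums or histograms would skip this loop entirely, which is the main motivation for agreeing on a shared format.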

Alternatives

  • No standardized API: each hardware has its own API in custom headers.
  • Reuse the Video Controls API for this with a custom struct.
@josuah josuah added area: Drivers area: Video Video subsystem RFC Request For Comments: want input from the community labels Feb 9, 2025
@josuah josuah self-assigned this Feb 9, 2025
Collaborator Author

josuah commented Feb 9, 2025

One alternative to having the R/G/B sums is to use a histogram for each of R, G, and B as well, with the number of buckets set to 1. This would give the same value, while also supporting hardware that reports full R, G, B histograms in the future, if any.

This would also make the bitmask unnecessary, since hardware that does not support a given histogram type can set its number of buckets to 0. [EDIT: This does not work. 😖 With just one bucket, we would only know that all values lie between min and max, not their sum.]
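
The problem noted in the EDIT can be seen with a small sketch. The helper below is hypothetical and assumes buckets that evenly split the range [0, max_value]; it estimates the sum of all samples by placing every sample at the midpoint of its bucket, which is about the best a consumer can do with a histogram alone.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Estimate the sum of all samples from a histogram by assuming every sample
 * sits at the midpoint of its bucket. Buckets evenly split [0, max_value].
 */
static uint64_t histogram_estimate_sum(const uint64_t *buckets, size_t num_buckets,
				       uint32_t max_value)
{
	uint64_t range = (uint64_t)max_value + 1;
	uint64_t sum = 0;

	for (size_t i = 0; i < num_buckets; i++) {
		/* Bucket i covers [lo, hi). */
		uint64_t lo = (uint64_t)i * range / num_buckets;
		uint64_t hi = (uint64_t)(i + 1) * range / num_buckets;

		sum += buckets[i] * ((lo + hi) / 2);
	}

	return sum;
}
```

With a single bucket, four black pixels and four white pixels both yield the same estimate (4 × 128 for 8-bit data), so the real sum is lost. With four buckets the two cases become distinguishable, but the result is still an approximation rather than the exact sum.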

Example of complete ISP pipeline for desktop in Python:
https://github.com/cruxopen/openISP

@rruuaanng rruuaanng added the area: API Changes to public APIs label Feb 9, 2025
Collaborator Author

josuah commented Feb 14, 2025

After trying to implement something, it seems like some bitmap to negotiate the type of stats could be useful:

     Application                        Video device                          Hardware

          |                                   |                                   |
          | Bitmap of stats supported         |                                   |
          |---------------------------------->|                                   |
          |                                   | Fetches the stats from hardware   |
          |                                   | to (struct image_stats)           |
          |                                   |---------------------------------->|
          |                                   |<----------------------------------|
          | Bitmap of stats actually present  |                                   |
          | with just one bit set             |                                   |
          |<----------------------------------|                                   |
          |                                   |                                   |

This would permit the statistics to be self-describing (with the bitmap always as the first field), and avoid too much ping-pong.
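
On the driver side, the exchange in the diagram could be sketched roughly as below. `stats_pick_type()` is a hypothetical helper, not an existing Zephyr function. It works on plain bitmaps, treats the application's bitmap as a preference only, and ignores any complementary flags (such as a sums/averages indicator) that would travel alongside the chosen type.

```c
#include <stdint.h>

/*
 * Pick the single statistics type a driver will report.
 * requested: bitmap of types the application can consume (preferences only).
 * supported: bitmap of types the hardware can produce.
 * fallback:  single bit the driver reports when nothing matches.
 */
static uint32_t stats_pick_type(uint32_t requested, uint32_t supported, uint32_t fallback)
{
	uint32_t common = requested & supported;

	if (common == 0) {
		/* Nothing in common: report what the hardware has anyway. */
		return fallback;
	}

	/* Isolate the lowest set bit so that exactly one type is reported. */
	return common & (~common + 1);
}
```

Picking the lowest common bit is an arbitrary policy chosen for brevity; a real driver might prefer the cheapest or most detailed statistics it can produce.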

There can be functions that convert the stats from one format to another, and the hardware does not have to send the stats requested by the application: the request only expresses preferences. This makes it possible to implement all sorts of dynamic strategies (first ask for channel statistics, then ask for full R/G/B histogram if not supported) without pushing complexity into the drivers.

This should help integrate the "statistics zoo" that hardware might present, and allow very custom algorithms to be implemented with a generic API.

For instance:

/** Complementary flag to the other channel statistics indicating that values are sums. */
#define VIDEO_STATS_CHANNELS_ARE_SUMS BIT(0)

/** Complementary flag to the other channel statistics indicating that values are averages. */
#define VIDEO_STATS_CHANNELS_ARE_AVERAGES BIT(1)

/** Channel statistics where CH0 is red, CH1 is 1st green, CH2 is 2nd green, CH3 is blue. */
#define VIDEO_STATS_CHANNELS_RGGB BIT(2)

/** Channel statistics where CH0 is Y (luma), CH1 is red, CH2 is green, CH3 is blue. */
#define VIDEO_STATS_CHANNELS_YRGB BIT(6)

/** Channel statistics where CH0 is Y (luma), CH1, CH2, CH3 are unused. */
#define VIDEO_STATS_CHANNELS_Y BIT(7)

/** Channel statistics where CH0 is unused, CH1 is red, CH2 is green, CH3 is blue. */
#define VIDEO_STATS_CHANNELS_XRGB BIT(8)

/** Statistics in the form of a histogram. R (red) channel only. */
#define VIDEO_STATS_HISTOGRAM_R BIT(9)

/** Statistics in the form of a histogram. G (green) channel only. */
#define VIDEO_STATS_HISTOGRAM_G BIT(10)

/** Statistics in the form of a histogram. B (blue) channel only. */
#define VIDEO_STATS_HISTOGRAM_B BIT(11)

/** Statistics in the form of a histogram. Y (luma) channel only. */
#define VIDEO_STATS_HISTOGRAM_Y BIT(12)

/** Statistics in the form of a histogram. U (Cb, blueness) channel only. */
#define VIDEO_STATS_HISTOGRAM_U BIT(13)

/** Statistics in the form of a histogram. V (Cr, redness) channel only. */
#define VIDEO_STATS_HISTOGRAM_V BIT(14)

/** Statistics in the form of a custom format, not covered by the Zephyr Video APIs. */
#define VIDEO_STATS_CUSTOM BIT(31)

/**
 * @brief Channel statistics in YUV order.
 *
 * CH0 is Y (luma), CH1 is U (Cb, blueness), CH2 is V (Cr, redness), CH3 is unused.
 */
#define VIDEO_STATS_CHANNELS_YUV BIT(3)

struct video_stats {
        uint32_t type_bitmap;
};

/**
 * @brief Statistics about the video image color content.
 *
 * Used by software algorithms to control the color balance such as White Balance (AWB),
 * Black Level Correction (BLC), or control sensors such as Exposure/Gain Control (AEC/AGC).
 */
struct video_channel_stats {
        struct video_stats base;
        uint64_t ch0;
        uint64_t ch1;
        uint64_t ch2;
        uint64_t ch3;
};

/**
 * @brief Statistics about the video image in the form of a histogram.
 *
 * Each bucket counts the pixels whose value falls within that bucket's range. Used by the
 * same software algorithms as the channel statistics: White Balance (AWB), Black Level
 * Correction (BLC), or Exposure/Gain Control (AEC/AGC).
 */
struct video_histogram {
        struct video_stats base;
        uint64_t *buckets;
        size_t num_buckets;
};
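
As an illustration of the conversion functions mentioned above, here is a minimal sketch of one of them. `video_stats_rggb_to_yrgb()` is a hypothetical helper; it repeats a subset of the proposed definitions so that it compiles on its own, uses a stand-in for Zephyr's `BIT()` macro, and assumes the naive luma formula of averaging R, G, and B.

```c
#include <errno.h>
#include <stdint.h>

#ifndef BIT
#define BIT(n) (1UL << (n)) /* stand-in for the Zephyr BIT() macro */
#endif

/* Subset of the definitions proposed above, repeated to stay self-contained. */
#define VIDEO_STATS_CHANNELS_ARE_AVERAGES BIT(1)
#define VIDEO_STATS_CHANNELS_RGGB         BIT(2)
#define VIDEO_STATS_CHANNELS_YRGB         BIT(6)

struct video_stats {
	uint32_t type_bitmap;
};

struct video_channel_stats {
	struct video_stats base;
	uint64_t ch0;
	uint64_t ch1;
	uint64_t ch2;
	uint64_t ch3;
};

/* Hypothetical helper: convert RGGB averages in place into YRGB averages. */
static int video_stats_rggb_to_yrgb(struct video_channel_stats *stats)
{
	uint32_t need = VIDEO_STATS_CHANNELS_RGGB | VIDEO_STATS_CHANNELS_ARE_AVERAGES;

	/* The bitmap in the base struct tells us what the channels contain. */
	if ((stats->base.type_bitmap & need) != need) {
		return -ENOTSUP;
	}

	uint64_t r = stats->ch0;
	uint64_t g = (stats->ch1 + stats->ch2) / 2; /* merge both green sites */
	uint64_t b = stats->ch3;

	stats->ch0 = (r + g + b) / 3; /* naive luma (assumption): mean of R, G, B */
	stats->ch1 = r;
	stats->ch2 = g;
	stats->ch3 = b;
	stats->base.type_bitmap = VIDEO_STATS_CHANNELS_YRGB | VIDEO_STATS_CHANNELS_ARE_AVERAGES;

	return 0;
}
```

The sketch only handles averages. Converting sums would also require knowing how many pixels contributed to each channel, which the proposed struct does not carry.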

Collaborator Author

josuah commented Feb 14, 2025

Example of intermediate "negotiation" in a situation where there is a lot of different statistics and a "good enough" Image Processing Algorithm (IPA) needs to be provided for a slightly mismatching API, without having to specialize the ISP to the hardware manually:

Application                 Video Device                 Library function
  |                              |                              |
  |                              |                              |
  | I want one of these stats    |                              |
  |--(bitmask)------------------>|                              |
  |                              |                              |
  |                              |                              |
  |     I have none of that, but |                              |
  |   here is a simple Y channel |                              |
  |                      average |                              |
  |<------------(stats+bitmask)--|                              |
  |                              |                              |
  |                              |                              |
  | I do not support these stats, here is what I received, here |
  | is the bitmask of what I support.                           |
  |--(stats+bitmask)------------------------------------------->|
  |                              |                              |
  |                              |                              |
  |     You give me the R/G/B averages, you want Y channel too, |
  |  I am filling the gaps by averaging the RGB channels into Y |
  |<-------------------------------------------(stats+bitmask)--|
  |                              |                              |
  |                              |                              |
  | Now I can compute my IPA     |                              |
  |                              |                              |
  |                              |                              |
  | Adjust exposure to be 3000   |                              |
  |--(control)------------------>|                              |
  |                              |                              |

This is illustrated here in a basic, ad-hoc implementation:
https://github.com/tinyvision-ai-inc/tinyvision_zephyr_sdk/blob/659ab06ec5391830e456922f0109ca43e1b46602/samples/imx219_dual/src/main.c#L55-L66

Then, providing an ISP to a particular system becomes a matter of having either:

  • The conversion functions between what the hardware has and what the IPA supports
  • An IPA that immediately supports the statistics provided by the hardware

This avoids manually redoing the IPA and adapting it to every new hardware. Even optimized IPAs can be added incrementally and ported to other hardware (if it supports the same stats).
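
To illustrate the second option, here is a minimal sketch of an IPA that accepts several of the proposed channel formats and derives a new exposure value from them. `ipa_get_luma()` and `ipa_next_exposure()` are hypothetical names, the proportional update rule is only one simple choice among many, and the flag and struct definitions are repeated from the proposal so the block compiles on its own.

```c
#include <errno.h>
#include <stdint.h>

#ifndef BIT
#define BIT(n) (1UL << (n)) /* stand-in for the Zephyr BIT() macro */
#endif

/* Subset of the definitions proposed earlier in this issue. */
#define VIDEO_STATS_CHANNELS_ARE_AVERAGES BIT(1)
#define VIDEO_STATS_CHANNELS_YRGB         BIT(6)
#define VIDEO_STATS_CHANNELS_Y            BIT(7)
#define VIDEO_STATS_CHANNELS_XRGB         BIT(8)

struct video_stats {
	uint32_t type_bitmap;
};

struct video_channel_stats {
	struct video_stats base;
	uint64_t ch0;
	uint64_t ch1;
	uint64_t ch2;
	uint64_t ch3;
};

/* Hypothetical: extract an average luma from whichever format was received. */
static int ipa_get_luma(const struct video_channel_stats *stats, uint32_t *luma)
{
	uint32_t type = stats->base.type_bitmap;

	if ((type & VIDEO_STATS_CHANNELS_ARE_AVERAGES) == 0) {
		return -ENOTSUP;
	}
	if (type & (VIDEO_STATS_CHANNELS_YRGB | VIDEO_STATS_CHANNELS_Y)) {
		*luma = (uint32_t)stats->ch0;
		return 0;
	}
	if (type & VIDEO_STATS_CHANNELS_XRGB) {
		/* No luma channel: fill the gap by averaging R, G, B. */
		*luma = (uint32_t)((stats->ch1 + stats->ch2 + stats->ch3) / 3);
		return 0;
	}

	return -ENOTSUP;
}

/* Hypothetical proportional step: scale exposure by target / measured luma. */
static uint32_t ipa_next_exposure(uint32_t exposure, uint32_t luma, uint32_t target,
				  uint32_t min, uint32_t max)
{
	uint64_t next;

	if (luma == 0) {
		return max; /* completely dark frame: open up as far as allowed */
	}

	next = (uint64_t)exposure * target / luma;
	if (next < min) {
		return min;
	}
	if (next > max) {
		return max;
	}

	return (uint32_t)next;
}
```

For example, with a current exposure of 1500, a measured average luma of 60, and a target of 120, the step returns 3000, which the application would then send to the video device as a control.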

josuah added a commit to tinyvision-ai-inc/zephyr that referenced this issue Feb 14, 2025
Introduce an abstraction layer handling the diversity of ways hardware
have to report statistics. This allows to take advantage of the various
channel average or histograms present on some hardware, that skip the
need to manually compute statistics.
Fixes zephyrproject-rtos#85457

Signed-off-by: Josuah Demangeon <me@josuah.net>