It’s been a while since I last wrote a blog post, and this seems like the right time. Lately I have been researching video encoding on power-constrained devices as part of my undergraduate thesis project. That led me to dig into how encoders work in general, and then into the details of HEVC and AV1.
AV1 has made very positive advances compared with HEVC. If you don’t follow the digital video field, AV1 is the next-generation codec that comes after HEVC. It could reportedly offer something like 40% better picture quality than HEVC, and it is cross-platform. As of now, though, its main disadvantage is that it is very compute intensive. To help tackle that, we need good software that does a full analysis of the video, including its metadata.
There is a tool called AOM Analyzer that has been around for a while.
It is a bitstream analyzer that performs various kinds of analysis on a video. It is cross-platform (macOS, Windows and Linux) and open source on GitHub, so you can grab it and try it out yourself. If you don’t want to build it, you can download a precompiled version instead. There is also a very nice live demo of the tool that anyone can try before downloading.
For more details, refer to the GitHub README.
The Analyzer is made of two components: an Emscripten-compiled version of the codec (decoder.js) and the front end of the app, which is built with HTML and TypeScript.
Anyone can run this analysis. All you need to do is specify a video file, preferably in “*.ivf” format, along with an Emscripten-compiled version of the decoder that can decode it. The invocation format is given under How to Run.
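If you want to sanity-check an “*.ivf” file before handing it to the analyzer, a minimal sketch like the one below can read its header. This is not part of AOM Analyzer. It assumes the commonly documented 32-byte little-endian IVF header layout (a `DKIF` signature followed by version, header length, FourCC, width, height, timebase and frame count), and the names `IvfHeader` and `parse_ivf_header` are made up for this post.

```python
import struct
from typing import NamedTuple


class IvfHeader(NamedTuple):
    fourcc: str        # codec tag, e.g. "AV01" for AV1
    width: int
    height: int
    timebase_den: int
    timebase_num: int
    frame_count: int


def parse_ivf_header(data: bytes) -> IvfHeader:
    """Parse the first 32 bytes of an IVF file (assumed layout, little-endian)."""
    if len(data) < 32:
        raise ValueError("IVF header needs at least 32 bytes")
    # 4s signature, H version, H header length, 4s FourCC, H width, H height,
    # I timebase denominator, I timebase numerator, I frame count, I unused
    (signature, _version, _header_len, fourcc,
     width, height, den, num, frames, _unused) = struct.unpack(
        "<4sHH4sHHIIII", data[:32])
    if signature != b"DKIF":
        raise ValueError("not an IVF file: missing DKIF signature")
    return IvfHeader(fourcc.decode("ascii"), width, height, den, num, frames)
```

To use it, read the first 32 bytes of the file in binary mode and pass them to `parse_ivf_header`. A FourCC other than the one your decoder expects is a quick hint that the wrong decoder was paired with the video.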
How to Run
We can use Electron to run the analyzer:
electron . decoder1 video1
Both the decoder and video parameters can point to either a local file or a remote URL.
If you don’t specify any arguments, it will pick random jobs from AreWeCompressedYet.
As mentioned earlier, if you downloaded a standalone package of the tool, you don’t need Electron; you can run the binary directly from the command line.
We can also use multiple decoders to analyze the video.
There are several other command-line options:
- --zoomFactor: sets the default zoom level. If the UI elements are too large for your resolution, you can set it to 0.75 or 0.50.
- --dev: opens the Electron dev tools, which are disabled by default.
- --frames: sets the number of frames to decode; the default is 4.
- --split: a very useful flag for side-by-side comparison of videos. For it to work you need to specify at least two videos. In this mode the tool doesn’t show any analyzer layers, which makes decoding faster. The split modes are:
  - Left (first video)
  - Right (second video)
  - Vertical Split (first video on the left, second video on the right)
  - Horizontal Split (first video on the top, second video on the bottom)
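If you launch the analyzer from a script, the options above can be assembled into an argument list. The sketch below is only an illustration: `build_analyzer_command` is a hypothetical helper, and the `--flag=value` syntax and the placement of flags before the decoder/video arguments are my assumptions, so check the README for the exact form. The value to pass to `--split` is also left to the caller, since I have not listed the accepted values here.

```python
from typing import List, Optional, Sequence, Tuple


def build_analyzer_command(
    pairs: Sequence[Tuple[str, str]],
    frames: Optional[int] = None,
    zoom_factor: Optional[float] = None,
    split: Optional[str] = None,
    dev: bool = False,
    launcher: Sequence[str] = ("electron", "."),
) -> List[str]:
    """Assemble an argv list from (decoder, video) pairs and optional flags."""
    # From the post: --split only makes sense with at least two videos.
    if split is not None and len(pairs) < 2:
        raise ValueError("--split needs at least two videos")
    cmd = list(launcher)
    # ASSUMPTION: flags use "--flag=value" and precede the positional arguments.
    if zoom_factor is not None:
        cmd.append(f"--zoomFactor={zoom_factor}")
    if frames is not None:
        cmd.append(f"--frames={frames}")
    if split is not None:
        cmd.append(f"--split={split}")  # value passed through unchanged
    if dev:
        cmd.append("--dev")
    # Each decoder and video may be a local path or a remote URL.
    for decoder, video in pairs:
        cmd.extend([decoder, video])
    return cmd


print(build_analyzer_command([("decoder1", "video1")]))
```

The resulting list can be handed to `subprocess.run`. For a standalone package, pass the path of the binary as `launcher` instead of `("electron", ".")`.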
The best part is that everything you do can be shared easily, and it all runs inside the browser. The decoder JS files are generated automatically and submitted to AreWeCompressedYet, so anyone can use them.
You can download videos from here to encode on your local machine. I recommend in_to_tree, ducks_take_off, crowd_run and old_town_cross. These are all 8-bit, and either the 1080p or the 720p versions work well for testing.
AOM Analyzer has a toolbar at the top with the following options:
- Layers: Toggle various layers on and off. Layers help you understand the different elements of the video, and their values can be inspected in the Block Info tab. These are the layers you can toggle:
  - Decoded Image: Shows the decoded image.
  - Super Block Grid: Shows the grid of the largest possible blocks (superblocks) for the frame. AV1 superblocks are 64×64 or 128×128, and the minimum block size is 4×4.
  - Split Grid: Shows the block split (partition) grid of the frame.
  - Transform Grid: Shows the transform block grid of the frame.
  - Transform Type: Shows the transform type of each block.
  - Motion Vector: Displays the motion vectors used for inter-frame prediction.
  - Frame Reference:
  - UV Mode:
  - Show Segment: Overlays the segments of the video.
  - Bits: Shows the bit accounting layer. To see it more clearly, disable the decoded image. There are three options for the bit scale:
    - Frame Relative: The maximum number of bits is computed over the current frame only.
    - Video Relative: The maximum number of bits is computed over all frames in the video sequence.
    - Video Relative (all): Same as Video Relative, except that it takes all loaded sequences into account, so it is useful when comparing frames between two sequences.
  - Heat Map: Shows the bits in a heat map style.
  - Heat Map (Opaque): Shows the heat map with an additional color, such as blue, for areas where no heat map is generated.
  - We can also toggle a specific accounting symbol to get additional information. This is useful for diving into the bit distribution of that symbol.
  - Skip: Shows whether a block has coefficients. Skipped blocks are usually drawn in violet/blue. If you look carefully, you can see that bits are overlaid only on non-skipped areas.
  - Filters: Shows the filters for the specific frame.
  - CDEF: The Constrained Directional Enhancement Filter. It is built out of three pieces (directional search, the constrained replacement/lowpass filter, and integer-pixel tap placement) that were used earlier in Daala. CDEF is intended to remove or reduce basis noise and ringing around hard edges in an image without blurring or damaging the edge.
  - Tiles: Adds tile outlines to the frame.
- Save Image: Saves the current image.
- Reset: Resets the analyzer state to the first frame and clears all layers.
- Previous: Goes back to the previous frame.
- Play/Pause: Plays or pauses the video.
- Next: Goes to the next frame.
- Zoom Out: Zooms out.
- Zoom In: Zooms in.
- Decode Additional 30 Frames: Decodes 30 more frames in a background thread. This may take a while, but you should still be able to use the analyzer while it is happening.
- Share: Creates a short URL for the analysis.
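The three bit-scale options listed under the Bits layer differ only in which set of blocks the maximum is taken over. The sketch below is my reading of that description, not the analyzer’s actual code: `scale_max` and `normalize` are hypothetical names, the per-block bit counts are made up, and dividing by the maximum to get a heat-map intensity is an assumption.

```python
from typing import Dict, List

# bits[video][frame] is a list of per-block bit counts (made-up numbers).
Sequences = Dict[str, List[List[int]]]


def scale_max(bits: Sequences, video: str, frame: int, mode: str) -> int:
    """Return the maximum bit count that the chosen scale is relative to."""
    if mode == "frame":        # Frame Relative: current frame only
        return max(bits[video][frame])
    if mode == "video":        # Video Relative: all frames of this video
        return max(max(f) for f in bits[video])
    if mode == "video_all":    # Video Relative (all): every loaded video
        return max(max(f) for frames in bits.values() for f in frames)
    raise ValueError(f"unknown bit scale mode: {mode}")


def normalize(bits: Sequences, video: str, frame: int, mode: str) -> List[float]:
    """Scale each block of one frame into [0, 1] for display."""
    peak = scale_max(bits, video, frame, mode)
    return [b / peak if peak else 0.0 for b in bits[video][frame]]


bits = {"a": [[10, 40], [20, 80]], "b": [[160, 5]]}
print(normalize(bits, "a", 0, "frame"))      # [0.25, 1.0]
print(normalize(bits, "a", 0, "video"))      # [0.125, 0.5]
print(normalize(bits, "a", 0, "video_all"))  # [0.0625, 0.25]
```

The same frame looks progressively "cooler" as the scale widens, which is why Video Relative (all) is the one to pick when two sequences need to be compared on equal terms.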
The tabs at the very top let you switch between videos. You can also use the number keys for this.
Current Video Quick Info
The red bar at the top provides quick info about the video’s current frame.
- Zoom: Click anywhere on the decoded image to zoom in on it.
- Histograms: This section plots histograms for the current video frame. The available histogram modes are:
  - Bits: Shows the number of bits spent in the frame.
  - Symbols: Shows the percentage of bits spent on each symbol type.
  - Block Size: Shows the percentage of pixels within each block size.
  - Transform Size: Shows the number of pixels within each transform size.
  - Transform Type: Percentage of pixels within each transform type.
  - UV Prediction Mode: Percentage of pixels within each prediction mode.
  - Skip: Percentage of pixels skipped.
  - Dual Filter Type
- Block Info: Information and accounting for the selected block. (When you click on the decoded image, an orange rectangle highlights the selected block.) This section shows:
  - Block Position
  - Block Size
  - Transform Size
  - Transform Type
  - UV Mode
  - Motion Vectors
  - Reference Frame
  - Dual Filter Type
  - DeltaQ Index
  - Segment ID
  - Accounting Table: This gives finer detail about a single block of the frame. It keeps track of the number of bits spent on each symbol in the bitstream; in other words, it has block-level context. The accounting table shows the following data:
    - Symbols: The different symbol types, for example read_coeffs_reverse_2d, av1_read_coeffs_txb, read_golomb, read_coeffs_reverse and decode_coefs. Each frame will have a different set of these.
    - Bits: Number of bits spent on a symbol in a particular frame.
    - Percentage: The share of the total number of bits spent in the block (or frame).
    - Samples: Number of samples read in a specific frame.
- Frame Info: Per-frame information and accounting. As with Block Info, it shows several fields:
  - Frame Type
  - Show Frame
  - Frame Size
  - MI Size
  - DeltaQ Res / Present Flag
  - Accounting Table: As in Block Info, this lists symbols and the bits spent on them, for example:
    - Symbols: More symbol types appear here, e.g. read_coeffs_reverse_2d, av1_read_tx_type, read_intra_mode_w.
- More: This has a few very useful options. You can request a feature or file a bug so the developers can fix it and make the product better. It also offers the option to download the video in IVF or Y4M format. Lastly, it shows the configuration for the video.
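The accounting tables described above boil down to a small calculation: for each symbol, the bits spent, that figure as a percentage of the total, and the number of samples. Here is a minimal sketch of it, assuming the per-symbol bits and sample counts are already available. `accounting_table` is a hypothetical name, and the numbers are made up; only the symbol names come from the tool.

```python
from typing import Dict, List, Tuple


def accounting_table(
    symbols: Dict[str, Tuple[float, int]],
) -> List[Tuple[str, float, float, int]]:
    """Turn {symbol: (bits, samples)} into rows of (symbol, bits, percent, samples)."""
    total_bits = sum(bits for bits, _ in symbols.values())
    rows = []
    for name, (bits, samples) in symbols.items():
        # Percentage is relative to all bits spent in the block (or frame).
        percent = 100.0 * bits / total_bits if total_bits else 0.0
        rows.append((name, bits, percent, samples))
    # Most expensive symbols first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


example = {"read_golomb": (25.0, 10), "decode_coefs": (75.0, 30)}
for name, bits, percent, samples in accounting_table(example):
    print(f"{name:15s} {bits:8.1f} {percent:6.1f}% {samples:6d}")
```

Passing block-level counts gives the Block Info table, and passing counts summed over the whole frame gives the Frame Info table.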
Issues with the current Analyzer
As of now, AOM Analyzer cannot properly visualize encoder metadata, and we need to find a solution for this. The rav1e team is working on one. In the past, the Daala analyzer and the AV1 analyzer did a lot to make this task easier, but there is still much to improve, and a lot of new metadata needs to be analyzed. For instance, if we want a more rapid analysis of frame allocation within a particular scene, it is not clear how that could be done.
PS: This information may not be 100% accurate. These are my own opinions, and I have referred to the websites mentioned below; see them for more information.