15 Nov

# Research Plan

Information-Theoretical Pose Estimation for Probabilistic Semantic Data Association

• Phase 1: Reproduce EM-Semantic Pose estimation
• Phase 2: Introduce the Entropy of landmarks
• Phase 3: Visualization
• Phase 4: Introduce the Mutual Information
• Phase 5: Marginalization edges of landmarks
• Phase 6: Generate stable object model
• Phase 7: Apply the algorithm on Dataset
• Summarize contribution of the project

## Phase 1: Reproduce EM-Semantic Pose estimation

• In this phase, we assume that the surroundings and objects are static. Moreover, we consider only a small number of objects in the environment, for instance, three objects for localization in a static environment.

• Currently, we treat regression and classification accuracy as a single score. For example, in YOLO or MASK_RCNN, classification and regression share the same set of parameters. In contrast, IoU-Net separates the classification score and the bounding-box localization score into two scores.

• We use ORB_SLAM2 as the basic SLAM structure.

### Key Value

By reproducing the EM semantic pose estimation from "Probabilistic Data Association for Semantic SLAM", we will present an open-source implementation of the EM Semantic SLAM algorithm. Further, since both this paper and its follow-up works omit the implementation details of the algorithm, we will provide a detailed description of the algorithm.

### TODO list

• [ ] Get detection scores from YOLO/MASK_RCNN network
• [ ] Convert the scores into probabilities to use as weights for Ceres optimization.
• [ ] Extract features only inside the YOLO bounding box (bbox) or the MASK_RCNN mask area.
• [ ] Implement the iterative probability product as the weight of each feature.
• [ ] Use the weighted feature matches to get the initial pose estimate.
• [ ] Reproject the prior semantic area into the current frame, and find the overlap area to select new weighted features.
• [ ] Update the pose. [One EM iteration loop]
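
The weighting steps in the list above can be sketched as follows. This is only a minimal illustration, assuming each tracked feature carries one detection probability per frame in which it fell inside a detection box or mask. The function name `feature_weights`, the input layout, and the final normalization are hypothetical choices for this sketch; they are not taken from the referenced paper or from the Ceres API.

```python
import math

def feature_weights(prob_history):
    """Return one normalized weight per feature.

    prob_history: list of sequences; entry k holds the detection
    probabilities (each in [0, 1]) observed for feature k so far.
    """
    # Iterative probability product: multiply the per-frame probabilities.
    raw = [math.prod(p) for p in prob_history]
    if not raw:
        return []
    total = sum(raw)
    if total == 0.0:
        # No feature has any support; fall back to uniform weights (assumption).
        return [1.0 / len(raw)] * len(raw)
    # Normalize so the weights sum to 1 (assumption, for comparability).
    return [r / total for r in raw]

# Three features observed in two, two, and one frame(s) respectively.
weights = feature_weights([[0.9, 0.8], [0.5, 0.5], [0.9]])
```

In an actual pipeline, such weights would scale the residual of each feature match in the pose optimization; how they are passed to the optimizer is left open here.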

## Phase 2: Introduce the Entropy of landmarks

Once features have been extracted and the detection masks of the objects are available, we assign each feature the classification score (probability) of its object.

We then use these probabilities to calculate the entropy of each object in the current frame. The entropy is used to set a threshold on frames and to decide which frames should be considered by SLAM.

$$H(X_i) = g(P(X_i)) = -\sum P(X_i) \log P(X_i)$$

Calculate the discount factor α to evaluate the frame quality:

$$\alpha = \frac{H_{i+1}}{H_{i}}$$
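
The two formulas above can be checked numerically with a small sketch. It assumes the sum runs over the class probabilities of one object and uses the natural logarithm. The function names and the `eps` guard against division by zero are illustrative assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum p * log(p), in nats.

    Zero-probability terms are skipped, since p * log(p) tends to 0.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def discount_factor(h_prev, h_next, eps=1e-12):
    """alpha = H_{i+1} / H_i; eps avoids division by zero (assumption)."""
    return h_next / max(h_prev, eps)

h_i = entropy([0.5, 0.5])      # maximally uncertain two-class observation
h_next = entropy([0.9, 0.1])   # more confident observation in the next frame
alpha = discount_factor(h_i, h_next)
```

With these example inputs the entropy drops from frame i to frame i+1, so α is below 1. Whether a small or a large α should mark a frame as usable depends on the threshold policy chosen later in this phase.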

### Key Value

While current approaches mostly use thresholds or engineering tricks to evaluate observation quality, we will provide an information-based evaluation method to find which objects/points are most valuable. The method is adaptive and also considers the history of the MDP process. The value of this step is to examine how much we can trust the entropy value in SLAM in a real environment.

#### TODO list

• [ ] Add feature attributes: probability and entropy
• [ ] Add an information class to store the entropy of the whole object
• [ ] Set appropriate thresholds for frame selection and object selection.
• [ ] Run the following tests:

| Item | Description | Expected Effect |
| --- | --- | --- |
| Original EM | Reproduce the semantic EM algorithm | Check the validity of the EM code |
| Info-EM | Add the entropy threshold | Check the validity of the Info-EM code |
| AprilTag disturbance | Use AprilTags as a disturbance around the objects | Check the stability of Info-EM |

## Phase 3: Visualization

• Color the feature point map by probability: a deeper color means a higher probability.
• Each class has its own color.
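
One possible color scheme for the two rules above is sketched below: the class selects the hue and the probability selects the saturation. The function name `point_color`, the minimum saturation of 0.2, and the HSV mapping are assumptions made for illustration; the actual viewer may need a different color format.

```python
import colorsys

def point_color(class_id, probability, num_classes):
    """Map a class to a hue and a probability to color depth.

    Returns an (R, G, B) tuple with components in 0..255.
    """
    hue = (class_id % num_classes) / num_classes
    p = min(max(probability, 0.0), 1.0)   # clamp to [0, 1]
    saturation = 0.2 + 0.8 * p            # higher probability -> deeper color
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return tuple(int(round(c * 255)) for c in (r, g, b))

confident = point_color(0, 1.0, 3)   # fully saturated color of class 0
uncertain = point_color(0, 0.0, 3)   # pale version of the same hue
```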

### Key Value

Intuitively show the result.

#### TODO list

• [ ] Assign a different color to each class.
• [ ] Give points with different probabilities different color depths.
• [ ] (optional) Draw edges between different objects

## Phase 4: Introduce the Mutual Information

• Define the mutual information equation as:
• $$P(X|Y) = \lambda e^{\lambda r}$$

where $$r$$ is the displacement of one object between two frames.
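
The plan does not yet fix how the displacement is measured. As one hedged possibility, the sketch below takes an object's position to be the centroid of its 3D landmark points, expressed in a common world frame, and takes the displacement to be the Euclidean distance between the centroids in two frames. The function names and the centroid choice are assumptions.

```python
import math

def centroid(points):
    """Mean of a non-empty list of 3D points given as (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def object_displacement(points_prev, points_curr):
    """Distance between the object's centroids in two frames (assumption)."""
    return math.dist(centroid(points_prev), centroid(points_curr))

prev_pts = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
curr_pts = [(3.0, 4.0, 0.0), (5.0, 4.0, 0.0)]
r = object_displacement(prev_pts, curr_pts)
```

A centroid is sensitive to which points are visible in each frame, so a more robust estimate (for example, from matched points only) may be preferable in practice.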

### Key Value

With mutual information estimation, the spatial stability of landmarks can be quantified without prior knowledge. Moreover, the whole process is unsupervised. The value of this step is to examine how much spatial stability affects localization performance and whether information theory can automatically discriminate between dynamic and static objects.

#### TODO list

• [ ] Add mutual information attribute to feature points
• [ ] Add a Huber kernel to suppress sensitive values
• [ ] Use the IMU or a motion model to predict the position, and use the mutual information to evaluate how certain it is that objects are static
• [ ] Use bundle adjustment to update the pose, and then update the weights of the objects. [Another EM loop]
• [ ] Test: slowly move one object; the pose estimate should remain stable.
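
For the Huber kernel item in the list above, the textbook form of the loss and its reweighting factor can be sketched as follows. This is a standalone illustration with an assumed threshold `delta`; it is not the parameterization used inside any particular solver, which may define the loss on squared residuals instead.

```python
def huber_loss(residual, delta=1.0):
    """Quadratic near zero, linear beyond delta."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)

def huber_weight(residual, delta=1.0):
    """Weight for iteratively reweighted least squares: 1 inside delta,
    delta / |residual| outside, so large residuals are down-weighted."""
    a = abs(residual)
    return 1.0 if a <= delta else delta / a

small = huber_loss(0.5)     # inside the quadratic region
large = huber_loss(3.0)     # inside the linear region
w = huber_weight(4.0)       # outlier gets a reduced weight
```

The effect is that a single large residual, such as one caused by a moving object, grows only linearly in the cost and therefore pulls the pose estimate less than it would under a pure squared loss.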

## Phase 5 (Optional): Marginalization edges of landmarks

• Marginalization: simplify the feature representation of the objects.

#### TODO list

• [ ] To be done; the details are not yet clear.

## Phase 6: Generate stable object model

• How to select the feature points to generate the stable object model?
• After generating the stable model, we can use the edges inside the model as constraints to improve the performance of pose estimation.

### Key Value

This step will examine whether we can obtain stable observation points during short-term observation. If so, our algorithm will automatically generate object models in an unsupervised way.

## Phase 7: Apply the algorithm on Dataset

• Indoor dataset first.

# Summarize contribution of the project

## Problem description

Why is the accuracy of semantic SLAM poor?

a. Uncertain segmentation accuracy

b. Dynamic or moving objects

## What do we want to do?

Unify the two problems in a single framework using information theory: entropy and mutual information.

a. Entropy
First, we use the iterative probability product as the weight of each feature pair in pose estimation. Then, we compute the entropy from the probabilities to **evaluate the current observation quality.**

b. Mutual Information
We calculate the 3D displacement of each object between two frames and generate the mutual information of that object. Then, we can use the mutual information to **estimate the spatial stability (namely, the dynamic state)**.