diff --git a/README.md b/README.md
new file mode 100644
index 0000000..b6ab7ef
--- /dev/null
+++ b/README.md
@@ -0,0 +1,136 @@
+Point Cloud Classification and Segmentation
+========================
+**Name: Omkar Chittar**
+**UID - 119193556**
+------------------------
+```
+PointCloud_Classification_and_Segmentation
++-checkpoints
++-data
++-logs
++-output
++-output_cls
++-output_cls_numpoints
++-output_cls_rotated
++-output_seg
++-output_seg_numpoints
++-output_seg_rotated
++-README.md
++-report
+-data_loader.py
+-eval_cls_numpoints.py
+-eval_cls_rotated.py
+-eval_cls.py
+-eval_seg_numpoints.py
+-eval_seg_rotated.py
+-eval_seg.py
+-models.py
+-train.py
+-utils.py
+```
+
+# **Installation**
+
+- Download and extract the files.
+- Make sure you meet all the requirements listed at: https://github.com/848f-3DVision/assignment2/tree/main
+- Alternatively, recreate the environment from `environment.yml`:
+```bash
+conda env create -f environment.yml
+conda activate pytorch3d-env
+```
+## Data Preparation
+Download the zip file (~2 GB) from https://drive.google.com/file/d/1wXOgwM_rrEYJfelzuuCkRfMmR0J7vLq_/view?usp=sharing and place the unzipped `data` folder under the root directory. It contains two folders (`cls` and `seg`), one per task, each holding `.npy` files for training and testing.
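+
+As a quick sanity check (a minimal sketch, assuming the dataset downloaded above; the expected shapes follow the comments in `data_loader.py`), the arrays can be inspected with NumPy:
+```python
+import numpy as np
+
+# Classification: points are (N, 10000, 3), labels are (N,)
+cls_data = np.load("./data/cls/data_test.npy")
+cls_label = np.load("./data/cls/label_test.npy")
+print(cls_data.shape, cls_label.shape)
+
+# Segmentation: per-point labels are (N, 10000)
+seg_label = np.load("./data/seg/label_test.npy")
+print(seg_label.shape)
+```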
+- The **data** folder consists of all the data necessary for the code.
+- There are 6 output folders:
+ 1. **output_cls** folder has all the images/gifs generated after running ```eval_cls.py```.
+ 2. **output_seg** folder has all the images/gifs generated after running ```eval_seg.py```.
+ 3. **output_cls_numpoints** folder has all the images/gifs generated after running ```eval_cls_numpoints.py```.
+ 4. **output_seg_numpoints** folder has all the images/gifs generated after running ```eval_seg_numpoints.py```.
+ 5. **output_cls_rotated** folder has all the images/gifs generated after running ```eval_cls_rotated.py```.
+ 6. **output_seg_rotated** folder has all the images/gifs generated after running ```eval_seg_rotated.py```.
+- All the necessary instructions for running the code are given in **README.md**.
+- The **report** folder contains the HTML file for the webpage.
+
+
+# **1. Classification Model**
+- After implementing the `TO DO` sections in:
+ 1. `models.py`
+ 2. `train.py`
+ 3. `eval_cls.py`
+
+Run the code:
+```bash
+python train.py --task cls
+```
+The code trains the model for the classification task.
+
+Evaluate the trained model by running the code:
+```bash
+python eval_cls.py
+```
+Evaluates the model for the classification task by rendering point clouds named with their ground truth class and their respective predicted class. Displays the accuracy of the trained model in the terminal. The rendered point clouds are saved in the **output_cls** folder.
+
+
+# **2. Segmentation Model**
+- After implementing the `TO DO` sections in:
+ 1. `models.py`
+ 2. `train.py`
+ 3. `eval_seg.py`
+
+Run the code:
+```bash
+python train.py --task seg
+```
+The code trains the model for the segmentation task.
+
+Evaluate the trained model by running the code:
+```bash
+python eval_seg.py
+```
+Evaluates the model for the segmentation task by rendering point clouds with each segmented region shown in a different color. The rendered point clouds are saved in the **output_seg** folder. Displays the accuracy of the trained model in the terminal.
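+
+Segmentation labels are mapped to per-point colors using the fixed palette in `viz_seg` (`utils.py`); a minimal sketch of that mapping with dummy labels:
+```python
+import torch
+
+# One RGB triple per segmentation class (same palette as viz_seg in utils.py)
+colors = [[1.0, 1.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0],
+          [1.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
+
+labels = torch.randint(0, 6, (10000,))   # dummy per-point labels
+point_colors = torch.zeros(10000, 3)     # one RGB color per point
+for i in range(6):
+    point_colors[labels == i] = torch.tensor(colors[i])
+print(point_colors.shape)                # torch.Size([10000, 3])
+```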
+
+
+# **3. Robustness Analysis**
+## **3.1. Rotating the point clouds**
+Here we evaluate the accuracy of both the classification and segmentation models after rotating the point clouds around any one axis (x/y/z) or a combination of axes, as sketched below.
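+
+A minimal, self-contained sketch of this operation on dummy data (the evaluation scripts build `R_x`, `R_y` and `R_z` in the same way and multiply them into the point clouds):
+```python
+import math
+import torch
+
+theta = 30                                    # rotation angle in degrees
+rad = math.radians(theta)
+c, s = math.cos(rad), math.sin(rad)
+
+# Rotation about the x-axis; the y- and z-axis matrices follow the same pattern
+R_x = torch.tensor([[1., 0., 0.],
+                    [0.,  c, -s],
+                    [0.,  s,  c]])
+
+points = torch.rand(4, 10000, 3)              # dummy batch of point clouds (B, N, 3)
+rotated = (R_x @ points.transpose(1, 2)).transpose(1, 2)
+print(rotated.shape)                          # torch.Size([4, 10000, 3])
+```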
+
+### 3.1.1. Classification
+Run the code:
+```bash
+python eval_cls_rotated.py
+```
+Evaluates the model with rotated inputs for the classification task by rendering point clouds named with their rotation angle, ground-truth class, and predicted class. Displays the accuracy of the trained model in the terminal. The rendered point clouds are saved in the **output_cls_rotated** folder.
+
+### 3.1.2. Segmentation
+Run the code:
+```bash
+python eval_seg_rotated.py
+```
+Evaluates the model with rotated inputs for the segmentation task by rendering point clouds named with their rotation angle and prediction accuracy. Displays the accuracy of the trained model in the terminal. The rendered point clouds are saved in the **output_seg_rotated** folder.
+
+
+## **3.2. Varying the sampled points in the point clouds**
+Here we evaluate the accuracy of both the classification and segmentation models while varying the number of points sampled from each point cloud, as sketched below.
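+
+A minimal sketch of the subsampling used by the `eval_*_numpoints.py` scripts: a random set of point indices is drawn once with `np.random.choice` and used to slice every point cloud (dummy data here):
+```python
+import numpy as np
+import torch
+
+points = torch.rand(4, 10000, 3)                      # dummy batch of point clouds (B, 10000, 3)
+
+for n in [10, 50, 100, 500, 1000, 10000]:
+    ind = np.random.choice(10000, n, replace=False)   # sample n point indices without replacement
+    sub = points[:, ind]                              # (B, n, 3)
+    print(n, sub.shape)
+```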
+
+### 3.2.1. Classification
+Run the code:
+```bash
+python eval_cls_numpoints.py
+```
+Evaluates the classification model with a varying number of sampled points, rendering point clouds named with their index, number of points, ground-truth class, and predicted class. Displays the accuracy of the trained model in the terminal. The rendered point clouds are saved in the **output_cls_numpoints** folder.
+
+### 3.2.2. Segmentation
+Run the code:
+```bash
+python eval_seg_numpoints.py
+```
+Evaluates the segmentation model with a varying number of sampled points, rendering point clouds named with their index, number of points, and prediction accuracy. Displays the accuracy of the trained model in the terminal. The rendered point clouds are saved in the **output_seg_numpoints** folder.
+
+
+# **4. Webpage**
+The HTML code for the webpage is stored in the *report* folder along with the images/gifs.
+Opening the *webpage.md.html* file in a browser displays the webpage.
+
+
+
+
diff --git a/data_loader.py b/data_loader.py
new file mode 100644
index 0000000..af115f0
--- /dev/null
+++ b/data_loader.py
@@ -0,0 +1,39 @@
+from torch.utils.data import DataLoader, Dataset
+import numpy as np
+import torch
+
+
+
+class CustomDataSet(Dataset):
+ """Load data under folders"""
+ def __init__(self, args, train=True):
+ self.main_dir = args.main_dir
+ self.task = args.task
+
+ if train:
+ data_path = self.main_dir + self.task + "/data_train.npy"
+ label_path = self.main_dir + self.task + "/label_train.npy"
+ else:
+ data_path = self.main_dir + self.task + "/data_test.npy"
+ label_path = self.main_dir + self.task + "/label_test.npy"
+
+ self.data = torch.from_numpy(np.load(data_path))
+ self.label = torch.from_numpy(np.load(label_path)).to(torch.long) # in cls task, (N,), in seg task, (N, 10000), N is the number of objects
+
+
+ def __len__(self):
+ return self.data.size()[0]
+
+ def __getitem__(self, idx):
+ return self.data[idx], self.label[idx]
+
+
+def get_data_loader(args, train=True):
+ """
+ Creates training and test data loaders
+ """
+ dataset = CustomDataSet(args=args, train=train)
+ dloader = DataLoader(dataset=dataset, batch_size=args.batch_size, shuffle=train, num_workers=args.num_workers)
+
+
+ return dloader
\ No newline at end of file
diff --git a/eval_cls.py b/eval_cls.py
new file mode 100644
index 0000000..a3cdf7a
--- /dev/null
+++ b/eval_cls.py
@@ -0,0 +1,103 @@
+import numpy as np
+import argparse
+
+import torch
+from models import cls_model
+from utils import create_dir, viz_cls
+from data_loader import get_data_loader
+
+import random
+import pytorch3d
+
+def create_parser():
+ """Creates a parser for command-line arguments.
+ """
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--num_cls_class', type=int, default=3, help='The number of classes')
+ parser.add_argument('--num_points', type=int, default=10000, help='The number of points per object to be included in the input data')
+
+ # Directories and checkpoint/sample iterations
+ parser.add_argument('--load_checkpoint', type=str, default='best_model') #model_epoch_0
+ parser.add_argument('--i', type=int, default=0, help="index of the object to visualize")
+
+ parser.add_argument('--test_data', type=str, default='./data/cls/data_test.npy')
+ parser.add_argument('--test_label', type=str, default='./data/cls/label_test.npy')
+ parser.add_argument('--output_dir', type=str, default='./output_cls')
+
+ parser.add_argument('--exp_name', type=str, default="exp", help='The name of the experiment')
+
+ parser.add_argument('--main_dir', type=str, default='./data/')
+ parser.add_argument('--task', type=str, default="cls", help='The task: cls or seg')
+ parser.add_argument('--batch_size', type=int, default=16, help='The number of images in a batch.')
+ parser.add_argument('--num_workers', type=int, default=0, help='The number of threads to use for the DataLoader.')
+
+ return parser
+
+
+if __name__ == '__main__':
+ parser = create_parser()
+ args = parser.parse_args()
+ args.device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
+
+ create_dir(args.output_dir)
+
+ # ------ TO DO: Initialize Model for Classification Task ------
+ model = cls_model().to(args.device)
+
+ # Load Model Checkpoint
+ model_path = './checkpoints/cls/{}.pt'.format(args.load_checkpoint)
+ with open(model_path, 'rb') as f:
+ state_dict = torch.load(f, map_location=args.device)
+ model.load_state_dict(state_dict)
+ model.eval()
+ print ("successfully loaded checkpoint from {}".format(model_path))
+
+ # Sample Points per Object
+ ind = np.random.choice(10000,args.num_points, replace=False)
+
+ # ------ TO DO: Make Prediction ------
+ test_dataloader = get_data_loader(args=args, train=False)
+
+ correct_obj = 0
+ num_obj = 0
+ predictions = []
+ for batch in test_dataloader:
+ point_clouds, labels = batch
+ point_clouds = point_clouds[:, ind].to(args.device)
+ labels = labels.to(args.device).to(torch.long)
+
+ with torch.no_grad():
+ pred_labels = torch.argmax(model(point_clouds), dim=-1, keepdim=False)
+ correct_obj += pred_labels.eq(labels.data).cpu().sum().item()
+ num_obj += labels.size()[0]
+
+ predictions.append(pred_labels)
+
+ accuracy = correct_obj / num_obj
+ print(f"test accuracy: {accuracy}")
+ predictions = torch.cat(predictions).detach().cpu()
+
+ # Visualize a few random test point clouds and failed test point clouds
+ fail_inds = torch.argwhere(predictions != test_dataloader.dataset.label)
+
+ for i in range(min(15, len(fail_inds))):
+ random_ind = random.randint(0, predictions.shape[0]-1)
+ while random_ind in fail_inds:
+ random_ind = random.randint(0, predictions.shape[0]-1)
+ verts = test_dataloader.dataset.data[random_ind, ind]
+ gt_cls = test_dataloader.dataset.label[random_ind].to(torch.long).detach().cpu().data
+ pred_cls = predictions[random_ind].detach().cpu().data
+
+ path = f"output_cls/1. random_vis_{random_ind}_with_gt_{gt_cls}_pred_{pred_cls}.gif"
+        viz_cls(verts, path, args.device)
+
+ for i in range(len(fail_inds)):
+ fail_ind = fail_inds[i]
+ verts = test_dataloader.dataset.data[fail_ind, ind]
+ gt_cls = test_dataloader.dataset.label[fail_ind].detach().cpu().data
+ pred_cls = predictions[fail_ind].detach().cpu().data
+ path = f"output_cls/1.1 fail_vis_{fail_ind}_with_gt_{gt_cls}_pred_{pred_cls}.gif"
+        viz_cls(verts, path, args.device)
+
+ print(f"test accuracy: {accuracy}")
\ No newline at end of file
diff --git a/eval_cls_numpoints.py b/eval_cls_numpoints.py
new file mode 100644
index 0000000..7e762ce
--- /dev/null
+++ b/eval_cls_numpoints.py
@@ -0,0 +1,92 @@
+import numpy as np
+import argparse
+
+import torch
+from models import cls_model
+from utils import create_dir, viz_cls
+from data_loader import get_data_loader
+
+import random
+import pytorch3d
+
+def create_parser():
+ """Creates a parser for command-line arguments.
+ """
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--num_cls_class', type=int, default=3, help='The number of classes')
+ parser.add_argument('--num_points', type=int, default=10000, help='The number of points per object to be included in the input data')
+
+ # Directories and checkpoint/sample iterations
+ parser.add_argument('--load_checkpoint', type=str, default='best_model') #model_epoch_0
+ parser.add_argument('--i', type=int, default=0, help="index of the object to visualize")
+
+ parser.add_argument('--test_data', type=str, default='./data/cls/data_test.npy')
+ parser.add_argument('--test_label', type=str, default='./data/cls/label_test.npy')
+ parser.add_argument('--output_dir', type=str, default='./output_cls_numpoints')
+
+ parser.add_argument('--exp_name', type=str, default="exp", help='The name of the experiment')
+
+ parser.add_argument('--main_dir', type=str, default='./data/')
+ parser.add_argument('--task', type=str, default="cls", help='The task: cls or seg')
+ parser.add_argument('--batch_size', type=int, default=16, help='The number of images in a batch.')
+ parser.add_argument('--num_workers', type=int, default=0, help='The number of threads to use for the DataLoader.')
+
+ return parser
+
+
+if __name__ == '__main__':
+ parser = create_parser()
+ args = parser.parse_args()
+ args.device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
+
+ create_dir(args.output_dir)
+
+ # ------ TO DO: Initialize Model for Classification Task ------
+ model = cls_model().to(args.device)
+
+ # Load Model Checkpoint
+ model_path = './checkpoints/cls/{}.pt'.format(args.load_checkpoint)
+ with open(model_path, 'rb') as f:
+ state_dict = torch.load(f, map_location=args.device)
+ model.load_state_dict(state_dict)
+ model.eval()
+ print ("successfully loaded checkpoint from {}".format(model_path))
+
+ index = [94, 702, 870]
+
+ for j in index:
+ n_points = [10, 50, 100, 500, 1000, 10000]
+
+ for n in n_points:
+ # Sample Points per Object
+ ind = np.random.choice(10000, n, replace=False)
+
+ # ------ TO DO: Make Prediction ------
+ test_dataloader = get_data_loader(args=args, train=False)
+
+ correct_obj = 0
+ num_obj = 0
+ predictions = []
+ for batch in test_dataloader:
+ point_clouds, labels = batch
+ point_clouds = point_clouds[:, ind].to(args.device)
+ labels = labels.to(args.device).to(torch.long)
+
+ with torch.no_grad():
+ pred_labels = torch.argmax(model(point_clouds), dim=-1, keepdim=False)
+ correct_obj += pred_labels.eq(labels.data).cpu().sum().item()
+ num_obj += labels.size()[0]
+
+ predictions.append(pred_labels)
+
+ accuracy = correct_obj / num_obj
+ print(f"test accuracy for num of points {n} : {accuracy}")
+ predictions = torch.cat(predictions).detach().cpu()
+
+ verts = test_dataloader.dataset.data[j, ind] # change j to args.i for a particular index visualization
+ gt_cls = test_dataloader.dataset.label[j].to(torch.long).detach().cpu().data
+ pred_cls = predictions[j].detach().cpu().data
+
+ path = f"output_cls_numpoints/3.2. vis_{j}_numpoints_{n}_with_gt_{gt_cls}_pred_{pred_cls}_acc_{accuracy}.gif"
+            viz_cls(verts, path, args.device)
\ No newline at end of file
diff --git a/eval_cls_rotated.py b/eval_cls_rotated.py
new file mode 100644
index 0000000..64cc41b
--- /dev/null
+++ b/eval_cls_rotated.py
@@ -0,0 +1,122 @@
+import numpy as np
+import argparse
+
+import torch
+from models import cls_model
+from utils import create_dir, viz_cls
+from data_loader import get_data_loader
+
+import pytorch3d
+
+def create_parser():
+ """Creates a parser for command-line arguments.
+ """
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--num_cls_class', type=int, default=3, help='The number of classes')
+ parser.add_argument('--num_points', type=int, default=10000, help='The number of points per object to be included in the input data')
+
+ # Directories and checkpoint/sample iterations
+ parser.add_argument('--load_checkpoint', type=str, default='best_model') #model_epoch_0
+ parser.add_argument('--i', type=int, default=0, help="index of the object to visualize")
+
+ parser.add_argument('--test_data', type=str, default='./data/cls/data_test.npy')
+ parser.add_argument('--test_label', type=str, default='./data/cls/label_test.npy')
+ parser.add_argument('--output_dir', type=str, default='./output_cls_rotated')
+
+ parser.add_argument('--exp_name', type=str, default="exp", help='The name of the experiment')
+
+ parser.add_argument('--main_dir', type=str, default='./data/')
+ parser.add_argument('--task', type=str, default="cls", help='The task: cls or seg')
+ parser.add_argument('--batch_size', type=int, default=16, help='The number of images in a batch.')
+ parser.add_argument('--num_workers', type=int, default=0, help='The number of threads to use for the DataLoader.')
+
+ return parser
+
+
+if __name__ == '__main__':
+ parser = create_parser()
+ args = parser.parse_args()
+ args.device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
+
+ create_dir(args.output_dir)
+
+ # ------ TO DO: Initialize Model for Classification Task ------
+ model = cls_model().to(args.device)
+
+ # Load Model Checkpoint
+ model_path = './checkpoints/cls/{}.pt'.format(args.load_checkpoint)
+ with open(model_path, 'rb') as f:
+ state_dict = torch.load(f, map_location=args.device)
+ model.load_state_dict(state_dict)
+ model.eval()
+ print ("successfully loaded checkpoint from {}".format(model_path))
+
+ # Sample Points per Object
+ ind = np.random.choice(10000,args.num_points, replace=False)
+
+ # ------ TO DO: Make Prediction ------
+ test_dataloader = get_data_loader(args=args, train=False)
+
+ index = [94, 702, 870]
+
+    # Keep an unrotated copy of the test data so each angle is applied to the original points
+    orig_data = test_dataloader.dataset.data.clone()
+
+    for j in index:
+        for theta in range(0, 91, 10):
+            # Rotation angle in radians
+            rad = torch.Tensor([theta * np.pi / 180.])[0]
+
+            # rotation around x-axis (identity here; uncomment the matrix below to enable)
+            R_x = torch.Tensor([[1, 0, 0],
+                                [0, 1, 0],
+                                [0, 0, 1]])
+            # R_x = torch.Tensor([[1, 0, 0],
+            #                     [0, torch.cos(rad), - torch.sin(rad)],
+            #                     [0, torch.sin(rad), torch.cos(rad)]])
+
+            # rotation around y-axis (identity here; uncomment the matrix below to enable)
+            R_y = torch.Tensor([[1, 0, 0],
+                                [0, 1, 0],
+                                [0, 0, 1]])
+            # R_y = torch.Tensor([[torch.cos(rad), 0, torch.sin(rad)],
+            #                     [0, 1, 0],
+            #                     [- torch.sin(rad), 0, torch.cos(rad)]])
+
+            # rotation around z-axis (active)
+            R_z = torch.Tensor([[torch.cos(rad), - torch.sin(rad), 0],
+                                [torch.sin(rad), torch.cos(rad), 0],
+                                [0, 0, 1]])
+
+            # Apply the combined rotation to the original (unrotated) point clouds
+            test_dataloader.dataset.data = ((R_x @ R_y @ R_z) @ orig_data.transpose(1, 2)).transpose(1, 2)
+
+
+ correct_obj = 0
+ num_obj = 0
+ predictions = []
+ for batch in test_dataloader:
+ point_clouds, labels = batch
+ point_clouds = point_clouds[:, ind].to(args.device)
+ labels = labels.to(args.device).to(torch.long)
+
+ with torch.no_grad():
+ pred_labels = torch.argmax(model(point_clouds), dim=-1, keepdim=False)
+ correct_obj += pred_labels.eq(labels.data).cpu().sum().item()
+ num_obj += labels.size()[0]
+
+ predictions.append(pred_labels)
+
+ accuracy = correct_obj / num_obj
+ print(f"test accuracy for angle {theta} : {accuracy}")
+ predictions = torch.cat(predictions).detach().cpu()
+
+ verts = test_dataloader.dataset.data[j, ind] # change j to args.i for a particular index visualization
+ gt_cls = test_dataloader.dataset.label[j].to(torch.long).detach().cpu().data
+ pred_cls = predictions[j].detach().cpu().data
+
+ path = f"output_cls_rotated/3.1. vis_{j}_angle_{theta}_with_gt_{gt_cls}_pred_{pred_cls}_acc_{accuracy}.gif"
+            viz_cls(verts, path, args.device)
\ No newline at end of file
diff --git a/eval_seg.py b/eval_seg.py
new file mode 100644
index 0000000..a77c79b
--- /dev/null
+++ b/eval_seg.py
@@ -0,0 +1,105 @@
+import numpy as np
+import argparse
+
+import torch
+from models import seg_model
+from data_loader import get_data_loader
+from utils import create_dir, viz_seg
+
+import random
+import pytorch3d
+
+
+def create_parser():
+ """Creates a parser for command-line arguments.
+ """
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--num_seg_class', type=int, default=6, help='The number of segmentation classes')
+ parser.add_argument('--num_points', type=int, default=10000, help='The number of points per object to be included in the input data')
+
+ parser.add_argument('--load_checkpoint', type=str, default='best_model')
+ parser.add_argument('--i', type=int, default=0, help="index of the object to visualize")
+
+ parser.add_argument('--test_data', type=str, default='./data/seg/data_test.npy')
+ parser.add_argument('--test_label', type=str, default='./data/seg/label_test.npy')
+ parser.add_argument('--output_dir', type=str, default='./output_seg')
+
+ parser.add_argument('--exp_name', type=str, default="exp", help='The name of the experiment')
+
+ parser.add_argument('--main_dir', type=str, default='./data/')
+ parser.add_argument('--task', type=str, default="seg", help='The task: cls or seg')
+ parser.add_argument('--batch_size', type=int, default=16, help='The number of images in a batch.')
+ parser.add_argument('--num_workers', type=int, default=0, help='The number of threads to use for the DataLoader.')
+
+ return parser
+
+
+if __name__ == '__main__':
+ parser = create_parser()
+ args = parser.parse_args()
+ args.device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
+
+ create_dir(args.output_dir)
+
+ # ------ TO DO: Initialize Model for Segmentation Task ------
+ model = seg_model().to(args.device)
+
+ # Load Model Checkpoint
+ model_path = './checkpoints/seg/{}.pt'.format(args.load_checkpoint)
+ with open(model_path, 'rb') as f:
+ state_dict = torch.load(f, map_location=args.device)
+ model.load_state_dict(state_dict)
+ model.eval()
+ print ("successfully loaded checkpoint from {}".format(model_path))
+
+ # Sample Points per Object
+ ind = np.random.choice(10000,args.num_points, replace=False)
+
+ # ------ TO DO: Make Prediction ------
+ test_dataloader = get_data_loader(args=args, train=False)
+
+ correct_point = 0
+ num_point = 0
+ predictions = []
+ for batch in test_dataloader:
+ point_clouds, labels = batch
+ point_clouds = point_clouds[:, ind].to(args.device)
+ labels = labels[:,ind].to(args.device).to(torch.long)
+
+ with torch.no_grad():
+ pred_labels = torch.argmax(model(point_clouds), dim=-1, keepdim=False)
+ correct_point += pred_labels.eq(labels.data).cpu().sum().item()
+ num_point += labels.view([-1,1]).size()[0]
+
+ predictions.append(pred_labels)
+
+ test_accuracy = correct_point / num_point
+ print(f"test accuracy: {test_accuracy}")
+ predictions = torch.cat(predictions).detach().cpu()
+
+ for i in range(15):
+ random_ind = random.randint(0, predictions.shape[0]-1)
+ verts = test_dataloader.dataset.data[random_ind, ind].detach().cpu()
+ labels = test_dataloader.dataset.label[random_ind, ind].to(torch.long).detach().cpu()
+
+ correct_point = predictions[random_ind].eq(labels.data).cpu().sum().item()
+ num_point = labels.view([-1,1]).size()[0]
+ accuracy = correct_point / num_point
+
+ viz_seg(verts, labels, "{}/2. obj_index_{}_gt_{}.gif".format(args.output_dir, random_ind, args.exp_name), args.device)
+        viz_seg(verts, predictions[random_ind], "{}/2. obj_index_{}_pred_{}_acc_{}.gif".format(args.output_dir, random_ind, args.exp_name, accuracy), args.device)
+
+ random_ind = 289
+ verts = test_dataloader.dataset.data[random_ind, ind].detach().cpu()
+ labels = test_dataloader.dataset.label[random_ind, ind].to(torch.long).detach().cpu()
+
+ correct_point = predictions[random_ind].eq(labels.data).cpu().sum().item()
+ num_point = labels.view([-1,1]).size()[0]
+ accuracy = correct_point / num_point
+
+ viz_seg(verts, labels, "{}/2. obj_index_{}_gt_{}.gif".format(args.output_dir, random_ind, args.exp_name), args.device)
+ viz_seg(verts, predictions[random_ind], "{}/2. obj_index_{}_pred_{}_acc_{}.gif".format(args.output_dir, random_ind, args.exp_name, accuracy), args.device)
+
+ print(f"test accuracy: {test_accuracy}")
\ No newline at end of file
diff --git a/eval_seg_numpoints.py b/eval_seg_numpoints.py
new file mode 100644
index 0000000..90bc132
--- /dev/null
+++ b/eval_seg_numpoints.py
@@ -0,0 +1,90 @@
+import numpy as np
+import argparse
+
+import torch
+from models import seg_model
+from utils import create_dir, viz_seg
+from data_loader import get_data_loader
+
+
+def create_parser():
+ """Creates a parser for command-line arguments.
+ """
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--num_seg_class', type=int, default=6, help='The number of segmentation classes')
+ parser.add_argument('--num_points', type=int, default=10000, help='The number of points per object to be included in the input data')
+
+ # Directories and checkpoint/sample iterations
+ parser.add_argument('--load_checkpoint', type=str, default='best_model') # model_epoch_0
+ parser.add_argument('--i', type=int, default=0, help="index of the object to visualize")
+
+ parser.add_argument('--test_data', type=str, default='./data/seg/data_test.npy')
+ parser.add_argument('--test_label', type=str, default='./data/seg/label_test.npy')
+ parser.add_argument('--output_dir', type=str, default='./output_seg_numpoints')
+
+ parser.add_argument('--exp_name', type=str, default="exp", help='The name of the experiment')
+
+ parser.add_argument('--main_dir', type=str, default='./data/')
+ parser.add_argument('--task', type=str, default="seg", help='The task: cls or seg')
+ parser.add_argument('--batch_size', type=int, default=16, help='The number of images in a batch.')
+ parser.add_argument('--num_workers', type=int, default=0, help='The number of threads to use for the DataLoader.')
+
+ return parser
+
+
+if __name__ == '__main__':
+ parser = create_parser()
+ args = parser.parse_args()
+ args.device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
+
+ create_dir(args.output_dir)
+
+ # ------ TO DO: Initialize Model for Segmentation Task ------
+ model = seg_model().to(args.device)
+
+ # Load Model Checkpoint
+ model_path = './checkpoints/seg/{}.pt'.format(args.load_checkpoint)
+ with open(model_path, 'rb') as f:
+ state_dict = torch.load(f, map_location=args.device)
+ model.load_state_dict(state_dict)
+ model.eval()
+ print("successfully loaded checkpoint from {}".format(model_path))
+
+ index = [20, 168, 96]
+
+ for j in index:
+ n_points = [10, 50, 100, 500, 1000, 10000]
+
+ for n in n_points:
+ # Sample Points per Object
+ ind = np.random.choice(10000, n, replace=False)
+
+ # ------ TO DO: Make Prediction ------
+ test_dataloader = get_data_loader(args=args, train=False)
+
+ correct_point = 0
+ num_point = 0
+ predictions = []
+ for batch in test_dataloader:
+ point_clouds, labels = batch
+ point_clouds = point_clouds[:, ind].to(args.device)
+ labels = labels[:, ind].to(args.device).to(torch.long)
+
+ with torch.no_grad():
+ pred_labels = torch.argmax(model(point_clouds), dim=-1, keepdim=False)
+ correct_point += pred_labels.eq(labels.data).cpu().sum().item()
+ num_point += labels.view([-1, 1]).size()[0]
+
+ predictions.append(pred_labels)
+
+ test_accuracy = correct_point / num_point
+ print(f"test accuracy for num of points {n} : {test_accuracy}")
+ predictions = torch.cat(predictions).detach().cpu()
+
+ verts = test_dataloader.dataset.data[j, ind].detach().cpu()
+ labels = test_dataloader.dataset.label[j, ind].to(torch.long).detach().cpu()
+
+ viz_seg(verts, labels, "{}/3.2. vis_{}_numpoints_{}_gt_{}.gif".format(args.output_dir, j, n, args.exp_name), args.device)
+ viz_seg(verts, predictions[j], "{}/3.2. vis_{}_numpoints_{}_pred_{}_acc_{}.gif".format(args.output_dir, j, n, args.exp_name, test_accuracy), args.device)
+
diff --git a/eval_seg_rotated.py b/eval_seg_rotated.py
new file mode 100644
index 0000000..c896fe6
--- /dev/null
+++ b/eval_seg_rotated.py
@@ -0,0 +1,128 @@
+import numpy as np
+import argparse
+
+import torch
+from models import seg_model
+from data_loader import get_data_loader
+from utils import create_dir, viz_seg
+
+import pytorch3d
+
+
+def create_parser():
+ """Creates a parser for command-line arguments.
+ """
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--num_seg_class', type=int, default=6, help='The number of segmentation classes')
+ parser.add_argument('--num_points', type=int, default=10000, help='The number of points per object to be included in the input data')
+
+ # Directories and checkpoint/sample iterations
+ parser.add_argument('--load_checkpoint', type=str, default='best_model') # model_epoch_0
+ parser.add_argument('--i', type=int, default=0, help="index of the object to visualize")
+
+ parser.add_argument('--test_data', type=str, default='./data/seg/data_test.npy')
+ parser.add_argument('--test_label', type=str, default='./data/seg/label_test.npy')
+ parser.add_argument('--output_dir', type=str, default='./output_seg_rotated')
+
+ parser.add_argument('--exp_name', type=str, default="exp", help='The name of the experiment')
+
+ parser.add_argument('--main_dir', type=str, default='./data/')
+ parser.add_argument('--task', type=str, default="seg", help='The task: cls or seg')
+ parser.add_argument('--batch_size', type=int, default=16, help='The number of images in a batch.')
+ parser.add_argument('--num_workers', type=int, default=0, help='The number of threads to use for the DataLoader.')
+
+ return parser
+
+
+if __name__ == '__main__':
+ parser = create_parser()
+ args = parser.parse_args()
+ args.device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
+
+ create_dir(args.output_dir)
+
+ # ------ TO DO: Initialize Model for Segmentation Task ------
+ model = seg_model().to(args.device)
+
+ # Load Model Checkpoint
+ model_path = './checkpoints/seg/{}.pt'.format(args.load_checkpoint)
+ with open(model_path, 'rb') as f:
+ state_dict = torch.load(f, map_location=args.device)
+ model.load_state_dict(state_dict)
+ model.eval()
+ print("successfully loaded checkpoint from {}".format(model_path))
+
+ # Sample Points per Object
+ ind = np.random.choice(10000, args.num_points, replace=False)
+
+ # ------ TO DO: Make Prediction ------
+ test_dataloader = get_data_loader(args=args, train=False)
+
+ index = [20, 168, 96]
+
+    # Keep an unrotated copy of the test data so each angle is applied to the original points
+    orig_data = test_dataloader.dataset.data.clone()
+
+    for j in index:
+        for theta in range(0, 91, 10):
+            # Rotation angle in radians
+            rad = torch.Tensor([theta * np.pi / 180.])[0]
+
+            # rotation around x-axis (identity here; uncomment the matrix below to enable)
+            R_x = torch.Tensor([[1, 0, 0],
+                                [0, 1, 0],
+                                [0, 0, 1]])
+            # R_x = torch.Tensor([[1, 0, 0],
+            #                     [0, torch.cos(rad), - torch.sin(rad)],
+            #                     [0, torch.sin(rad), torch.cos(rad)]])
+
+            # rotation around y-axis (identity here; uncomment the matrix below to enable)
+            R_y = torch.Tensor([[1, 0, 0],
+                                [0, 1, 0],
+                                [0, 0, 1]])
+            # R_y = torch.Tensor([[torch.cos(rad), 0, torch.sin(rad)],
+            #                     [0, 1, 0],
+            #                     [- torch.sin(rad), 0, torch.cos(rad)]])
+
+            # rotation around z-axis (active)
+            R_z = torch.Tensor([[torch.cos(rad), - torch.sin(rad), 0],
+                                [torch.sin(rad), torch.cos(rad), 0],
+                                [0, 0, 1]])
+
+            # Apply the combined rotation to the original (unrotated) point clouds
+            test_dataloader.dataset.data = ((R_x @ R_y @ R_z) @ orig_data.transpose(1, 2)).transpose(1, 2)
+
+ correct_point = 0
+ num_point = 0
+ predictions = []
+ for batch in test_dataloader:
+ point_clouds, labels = batch
+ point_clouds = point_clouds[:, ind].to(args.device)
+ labels = labels[:, ind].to(args.device).to(torch.long)
+
+ with torch.no_grad():
+ pred_labels = torch.argmax(model(point_clouds), dim=-1, keepdim=False)
+ correct_point += pred_labels.eq(labels.data).cpu().sum().item()
+ num_point += labels.view([-1, 1]).size()[0]
+
+ predictions.append(pred_labels)
+
+ test_accuracy = correct_point / num_point
+ print(f"test accuracy for angle {theta} : {test_accuracy}")
+ predictions = torch.cat(predictions).detach().cpu()
+
+ verts = test_dataloader.dataset.data[j, ind].detach().cpu()
+ labels = test_dataloader.dataset.label[j, ind].to(torch.long).detach().cpu()
+
+ correct_point = predictions[j].eq(labels.data).cpu().sum().item()
+ num_point = labels.view([-1, 1]).size()[0]
+ accuracy = correct_point / num_point
+
+ viz_seg(verts, labels, "{}/3.1. vis_{}_angle_{}_gt_{}.gif".format(args.output_dir, j, theta, args.exp_name), args.device)
+ viz_seg(verts, predictions[j], "{}/3.1. vis_{}_angle_{}_pred_{}_acc_{}.gif".format(args.output_dir, j, theta, args.exp_name, accuracy), args.device)
+
diff --git a/models.py b/models.py
new file mode 100644
index 0000000..bb8c2de
--- /dev/null
+++ b/models.py
@@ -0,0 +1,99 @@
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+# ------ TO DO ------
+class cls_model(nn.Module):
+ def __init__(self, num_classes=3):
+ super(cls_model, self).__init__()
+ self.conv1 = nn.Conv1d(3, 64, 1)
+ self.conv2 = nn.Conv1d(64, 64, 1)
+ self.conv3 = nn.Conv1d(64, 128, 1)
+ self.conv4 = nn.Conv1d(128, 1024, 1)
+
+ self.bn1 = nn.BatchNorm1d(64)
+ self.bn2 = nn.BatchNorm1d(64)
+ self.bn3 = nn.BatchNorm1d(128)
+ self.bn4 = nn.BatchNorm1d(1024)
+
+ self.fc = nn.Sequential(
+ nn.Linear(1024, 512),
+ nn.BatchNorm1d(512),
+ nn.ReLU(),
+ nn.Dropout(p=0.3),
+ nn.Linear(512, 256),
+ nn.BatchNorm1d(256),
+ nn.ReLU(),
+ nn.Dropout(p=0.3),
+ nn.Linear(256, num_classes)
+ )
+
+ def forward(self, points):
+ '''
+ points: tensor of size (B, N, 3)
+ , where B is batch size and N is the number of points per object (N=10000 by default)
+ output: tensor of size (B, num_classes)
+ '''
+ points = points.transpose(1, 2)
+
+ out = F.relu(self.bn1(self.conv1(points)))
+ out = F.relu(self.bn2(self.conv2(out)))
+ out = F.relu(self.bn3(self.conv3(out)))
+ out = F.relu(self.bn4(self.conv4(out)))
+
+ # max pool
+ out = torch.amax(out, dim=-1)
+
+ out = self.fc(out)
+
+ return out
+
+
+# ------ TO DO ------
+class seg_model(nn.Module):
+ def __init__(self, num_seg_classes = 6):
+ super(seg_model, self).__init__()
+ self.conv1 = nn.Conv1d(3, 64, 1)
+ self.conv2 = nn.Conv1d(64, 64, 1)
+ self.conv3 = nn.Conv1d(64, 128, 1)
+ self.conv4 = nn.Conv1d(128, 1024, 1)
+
+ self.bn1 = nn.BatchNorm1d(64)
+ self.bn2 = nn.BatchNorm1d(64)
+ self.bn3 = nn.BatchNorm1d(128)
+ self.bn4 = nn.BatchNorm1d(1024)
+
+ self.point_layer = nn.Sequential(
+ nn.Conv1d(1088, 512, 1),
+ nn.BatchNorm1d(512),
+ nn.ReLU(),
+ nn.Conv1d(512, 256, 1),
+ nn.BatchNorm1d(256),
+ nn.ReLU(),
+ nn.Conv1d(256, 128, 1),
+ nn.BatchNorm1d(128),
+ nn.ReLU(),
+ nn.Conv1d(128, num_seg_classes, 1),
+ )
+
+
+ def forward(self, points):
+ '''
+ points: tensor of size (B, N, 3)
+ , where B is batch size and N is the number of points per object (N=10000 by default)
+ output: tensor of size (B, N, num_seg_classes)
+ '''
+ N = points.shape[1]
+ points = points.transpose(1, 2)
+
+ local_out = F.relu(self.bn1(self.conv1(points)))
+ local_out = F.relu(self.bn2(self.conv2(local_out)))
+
+ global_out = F.relu(self.bn3(self.conv3(local_out)))
+ global_out = F.relu(self.bn4(self.conv4(global_out)))
+        global_out = torch.amax(global_out, dim=-1, keepdim=True).repeat(1, 1, N)
+
+ out = torch.cat((local_out, global_out), dim=1)
+ out = self.point_layer(out).transpose(1, 2)
+
+ return out
\ No newline at end of file
diff --git a/output.zip b/output.zip
new file mode 100644
index 0000000..393515a
Binary files /dev/null and b/output.zip differ
diff --git a/output_cls.zip b/output_cls.zip
new file mode 100644
index 0000000..4ae3b6e
Binary files /dev/null and b/output_cls.zip differ
diff --git a/output_cls_numpoints.zip b/output_cls_numpoints.zip
new file mode 100644
index 0000000..7a58223
Binary files /dev/null and b/output_cls_numpoints.zip differ
diff --git a/output_cls_rotated.zip b/output_cls_rotated.zip
new file mode 100644
index 0000000..5cbfe04
Binary files /dev/null and b/output_cls_rotated.zip differ
diff --git a/output_seg_numpoints.zip b/output_seg_numpoints.zip
new file mode 100644
index 0000000..142891e
Binary files /dev/null and b/output_seg_numpoints.zip differ
diff --git a/output_seg_rotated.zip b/output_seg_rotated.zip
new file mode 100644
index 0000000..3880fae
Binary files /dev/null and b/output_seg_rotated.zip differ
diff --git a/report/1.gif b/report/1.gif
new file mode 100644
index 0000000..2c73d69
Binary files /dev/null and b/report/1.gif differ
diff --git a/report/10.gif b/report/10.gif
new file mode 100644
index 0000000..2076d22
Binary files /dev/null and b/report/10.gif differ
diff --git a/report/11.gif b/report/11.gif
new file mode 100644
index 0000000..8fd0a47
Binary files /dev/null and b/report/11.gif differ
diff --git a/report/12.gif b/report/12.gif
new file mode 100644
index 0000000..1574b4a
Binary files /dev/null and b/report/12.gif differ
diff --git a/report/13.gif b/report/13.gif
new file mode 100644
index 0000000..8de4763
Binary files /dev/null and b/report/13.gif differ
diff --git a/report/14.gif b/report/14.gif
new file mode 100644
index 0000000..6fe7dbe
Binary files /dev/null and b/report/14.gif differ
diff --git a/report/15.gif b/report/15.gif
new file mode 100644
index 0000000..817a17f
Binary files /dev/null and b/report/15.gif differ
diff --git a/report/16.gif b/report/16.gif
new file mode 100644
index 0000000..ac9322d
Binary files /dev/null and b/report/16.gif differ
diff --git a/report/17.gif b/report/17.gif
new file mode 100644
index 0000000..6266782
Binary files /dev/null and b/report/17.gif differ
diff --git a/report/18.gif b/report/18.gif
new file mode 100644
index 0000000..a294200
Binary files /dev/null and b/report/18.gif differ
diff --git a/report/19.gif b/report/19.gif
new file mode 100644
index 0000000..fed338e
Binary files /dev/null and b/report/19.gif differ
diff --git a/report/2.gif b/report/2.gif
new file mode 100644
index 0000000..dd89a88
Binary files /dev/null and b/report/2.gif differ
diff --git a/report/20.gif b/report/20.gif
new file mode 100644
index 0000000..16ca163
Binary files /dev/null and b/report/20.gif differ
diff --git a/report/21.gif b/report/21.gif
new file mode 100644
index 0000000..a649a44
Binary files /dev/null and b/report/21.gif differ
diff --git a/report/22.gif b/report/22.gif
new file mode 100644
index 0000000..a649a44
Binary files /dev/null and b/report/22.gif differ
diff --git a/report/23.gif b/report/23.gif
new file mode 100644
index 0000000..b699fe6
Binary files /dev/null and b/report/23.gif differ
diff --git a/report/24.gif b/report/24.gif
new file mode 100644
index 0000000..4c58c97
Binary files /dev/null and b/report/24.gif differ
diff --git a/report/25.gif b/report/25.gif
new file mode 100644
index 0000000..9660018
Binary files /dev/null and b/report/25.gif differ
diff --git a/report/26.gif b/report/26.gif
new file mode 100644
index 0000000..f26121a
Binary files /dev/null and b/report/26.gif differ
diff --git a/report/27.gif b/report/27.gif
new file mode 100644
index 0000000..367bb17
Binary files /dev/null and b/report/27.gif differ
diff --git a/report/28.gif b/report/28.gif
new file mode 100644
index 0000000..ce6be22
Binary files /dev/null and b/report/28.gif differ
diff --git a/report/29.gif b/report/29.gif
new file mode 100644
index 0000000..52ce0ef
Binary files /dev/null and b/report/29.gif differ
diff --git a/report/3.gif b/report/3.gif
new file mode 100644
index 0000000..21f6a70
Binary files /dev/null and b/report/3.gif differ
diff --git a/report/30.gif b/report/30.gif
new file mode 100644
index 0000000..aa53c38
Binary files /dev/null and b/report/30.gif differ
diff --git a/report/31.gif b/report/31.gif
new file mode 100644
index 0000000..8cde583
Binary files /dev/null and b/report/31.gif differ
diff --git a/report/32.gif b/report/32.gif
new file mode 100644
index 0000000..bfa7c83
Binary files /dev/null and b/report/32.gif differ
diff --git a/report/33.gif b/report/33.gif
new file mode 100644
index 0000000..f1d446c
Binary files /dev/null and b/report/33.gif differ
diff --git a/report/34.gif b/report/34.gif
new file mode 100644
index 0000000..20bbe7e
Binary files /dev/null and b/report/34.gif differ
diff --git a/report/35.gif b/report/35.gif
new file mode 100644
index 0000000..33698e9
Binary files /dev/null and b/report/35.gif differ
diff --git a/report/36.gif b/report/36.gif
new file mode 100644
index 0000000..19b4077
Binary files /dev/null and b/report/36.gif differ
diff --git a/report/37.gif b/report/37.gif
new file mode 100644
index 0000000..880485a
Binary files /dev/null and b/report/37.gif differ
diff --git a/report/38.gif b/report/38.gif
new file mode 100644
index 0000000..2d9da03
Binary files /dev/null and b/report/38.gif differ
diff --git a/report/39.gif b/report/39.gif
new file mode 100644
index 0000000..a41caf3
Binary files /dev/null and b/report/39.gif differ
diff --git a/report/4.gif b/report/4.gif
new file mode 100644
index 0000000..057e614
Binary files /dev/null and b/report/4.gif differ
diff --git a/report/40.gif b/report/40.gif
new file mode 100644
index 0000000..521ca3c
Binary files /dev/null and b/report/40.gif differ
diff --git a/report/41.gif b/report/41.gif
new file mode 100644
index 0000000..dcc0bb4
Binary files /dev/null and b/report/41.gif differ
diff --git a/report/42.gif b/report/42.gif
new file mode 100644
index 0000000..3f8551f
Binary files /dev/null and b/report/42.gif differ
diff --git a/report/43.gif b/report/43.gif
new file mode 100644
index 0000000..206fecf
Binary files /dev/null and b/report/43.gif differ
diff --git a/report/44.gif b/report/44.gif
new file mode 100644
index 0000000..6cf7b32
Binary files /dev/null and b/report/44.gif differ
diff --git a/report/45.gif b/report/45.gif
new file mode 100644
index 0000000..ccef9a4
Binary files /dev/null and b/report/45.gif differ
diff --git a/report/46.gif b/report/46.gif
new file mode 100644
index 0000000..2a6c39a
Binary files /dev/null and b/report/46.gif differ
diff --git a/report/47.gif b/report/47.gif
new file mode 100644
index 0000000..b54df36
Binary files /dev/null and b/report/47.gif differ
diff --git a/report/48.gif b/report/48.gif
new file mode 100644
index 0000000..a01a8d3
Binary files /dev/null and b/report/48.gif differ
diff --git a/report/49.gif b/report/49.gif
new file mode 100644
index 0000000..3febbc5
Binary files /dev/null and b/report/49.gif differ
diff --git a/report/5.gif b/report/5.gif
new file mode 100644
index 0000000..a46a115
Binary files /dev/null and b/report/5.gif differ
diff --git a/report/50.gif b/report/50.gif
new file mode 100644
index 0000000..353c4a4
Binary files /dev/null and b/report/50.gif differ
diff --git a/report/51.gif b/report/51.gif
new file mode 100644
index 0000000..0814d77
Binary files /dev/null and b/report/51.gif differ
diff --git a/report/52.gif b/report/52.gif
new file mode 100644
index 0000000..9660018
Binary files /dev/null and b/report/52.gif differ
diff --git a/report/53.gif b/report/53.gif
new file mode 100644
index 0000000..9453ddf
Binary files /dev/null and b/report/53.gif differ
diff --git a/report/54.gif b/report/54.gif
new file mode 100644
index 0000000..ff39791
Binary files /dev/null and b/report/54.gif differ
diff --git a/report/55.gif b/report/55.gif
new file mode 100644
index 0000000..2a44459
Binary files /dev/null and b/report/55.gif differ
diff --git a/report/56.gif b/report/56.gif
new file mode 100644
index 0000000..52ce0ef
Binary files /dev/null and b/report/56.gif differ
diff --git a/report/57.gif b/report/57.gif
new file mode 100644
index 0000000..eec504e
Binary files /dev/null and b/report/57.gif differ
diff --git a/report/58.gif b/report/58.gif
new file mode 100644
index 0000000..177670b
Binary files /dev/null and b/report/58.gif differ
diff --git a/report/59.gif b/report/59.gif
new file mode 100644
index 0000000..81b9e83
Binary files /dev/null and b/report/59.gif differ
diff --git a/report/6.gif b/report/6.gif
new file mode 100644
index 0000000..1bd67b2
Binary files /dev/null and b/report/6.gif differ
diff --git a/report/60.gif b/report/60.gif
new file mode 100644
index 0000000..f1d446c
Binary files /dev/null and b/report/60.gif differ
diff --git a/report/61.gif b/report/61.gif
new file mode 100644
index 0000000..8a63601
Binary files /dev/null and b/report/61.gif differ
diff --git a/report/62.gif b/report/62.gif
new file mode 100644
index 0000000..012df49
Binary files /dev/null and b/report/62.gif differ
diff --git a/report/63.gif b/report/63.gif
new file mode 100644
index 0000000..7108e41
Binary files /dev/null and b/report/63.gif differ
diff --git a/report/64.gif b/report/64.gif
new file mode 100644
index 0000000..880485a
Binary files /dev/null and b/report/64.gif differ
diff --git a/report/65.gif b/report/65.gif
new file mode 100644
index 0000000..11854ff
Binary files /dev/null and b/report/65.gif differ
diff --git a/report/66.gif b/report/66.gif
new file mode 100644
index 0000000..c6f1f66
Binary files /dev/null and b/report/66.gif differ
diff --git a/report/67.gif b/report/67.gif
new file mode 100644
index 0000000..e669515
Binary files /dev/null and b/report/67.gif differ
diff --git a/report/68.gif b/report/68.gif
new file mode 100644
index 0000000..dcc0bb4
Binary files /dev/null and b/report/68.gif differ
diff --git a/report/69.gif b/report/69.gif
new file mode 100644
index 0000000..6532286
Binary files /dev/null and b/report/69.gif differ
diff --git a/report/7.gif b/report/7.gif
new file mode 100644
index 0000000..7ca14f6
Binary files /dev/null and b/report/7.gif differ
diff --git a/report/70.gif b/report/70.gif
new file mode 100644
index 0000000..c6f1f66
Binary files /dev/null and b/report/70.gif differ
diff --git a/report/71.gif b/report/71.gif
new file mode 100644
index 0000000..e669515
Binary files /dev/null and b/report/71.gif differ
diff --git a/report/72.gif b/report/72.gif
new file mode 100644
index 0000000..dcc0bb4
Binary files /dev/null and b/report/72.gif differ
diff --git a/report/73.gif b/report/73.gif
new file mode 100644
index 0000000..6532286
Binary files /dev/null and b/report/73.gif differ
diff --git a/report/74.gif b/report/74.gif
new file mode 100644
index 0000000..cdb3ba1
Binary files /dev/null and b/report/74.gif differ
diff --git a/report/75.gif b/report/75.gif
new file mode 100644
index 0000000..453b262
Binary files /dev/null and b/report/75.gif differ
diff --git a/report/76.gif b/report/76.gif
new file mode 100644
index 0000000..ccef9a4
Binary files /dev/null and b/report/76.gif differ
diff --git a/report/8.gif b/report/8.gif
new file mode 100644
index 0000000..3155ecf
Binary files /dev/null and b/report/8.gif differ
diff --git a/report/9.gif b/report/9.gif
new file mode 100644
index 0000000..26e707e
Binary files /dev/null and b/report/9.gif differ
diff --git a/report/webpage.md.html b/report/webpage.md.html
new file mode 100644
index 0000000..07cb4f5
--- /dev/null
+++ b/report/webpage.md.html
@@ -0,0 +1,177 @@
+
+**CMSC848F Assignment 4: Point Cloud Classification and Segmentation**
+
+Name: Omkar Chittar
+
+UID: 119193556
+
+
+Classification Model
+===============================================================================
+I implemented the PointNet architecture.
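+
+A condensed view of the classifier defined in `models.py` (a minimal sketch, assuming it is run from the repository root so `models.py` is importable; the random input is only a stand-in for real data):
+```python
+import torch
+from models import cls_model
+
+# Point-wise shared MLP (3 -> 64 -> 64 -> 128 -> 1024), max-pool over the points,
+# then a fully connected head (1024 -> 512 -> 256 -> num_classes).
+model = cls_model(num_classes=3).eval()
+points = torch.rand(2, 10000, 3)      # (B, N, 3) dummy input
+with torch.no_grad():
+    scores = model(points)            # (B, num_classes)
+print(scores.shape)                   # torch.Size([2, 3])
+```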
+
+Run:
+```bash
+python train.py --task cls
+```
+for training the classification model.
+
+Run:
+```bash
+python eval_cls.py
+```
+for evaluating the trained model.
+
+The best model weights are stored as **`best_model.pt`** in the **`./checkpoints/cls`** folder; the model achieves a test accuracy of **0.9769**.
+
+## Results
+
+### Correct Classifications
+
+
+| Point Cloud Examples | | | Ground Truth Class | Predicted Class |
+|:-----------|------------|:----------:|--------------------|-----------------|
+|  |  |  | Chair | Chair |
+|  |  |  | Vase | Vase |
+|  |  |  | Lamp | Lamp |
+
+
+### Incorrect Classifications
+
+
+| Point Cloud | Ground Truth Class | Predicted Class |
+|:-----------:|:------------------:|:---------------:|
+|  | Chair | Lamp |
+|  | Vase | Lamp |
+|  | Lamp | Vase |
+
+Analysis
+-------------------------------------------------------------------------------
+
+The misclassifications made by the PointNet model on the few failure cases seem to be due to those examples deviating significantly from the norm for their respective categories. For instance, the misclassified chair examples have unusual or atypical designs - one is folded up and missing a seat, while the other is unusually tall. Similarly, some of the misclassified vases and lamps have shapes that overlap more with the opposing class.
+Additionally, the chair class appears to have less shape variety overall compared to vases and lamps. Chair components tend to be more standardized (seat, legs, back, etc.). In contrast, the vase and lamp categories exhibit greater diversity in proportions and silhouettes (floor lamps vs. desk lamps, vases with or without flowers, etc.). The model's confusion between these two classes likely stems from their greater morphological similarity in many cases: symmetry about the vertical axis, cylindrical profiles, and so on.
+
+
+Segmentation Model
+===============================================================================
+I implemented the PointNet architecture.
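+
+A condensed view of the segmentation network defined in `models.py` (a minimal sketch, assuming it is run from the repository root; the random input is only a stand-in for real data):
+```python
+import torch
+from models import seg_model
+
+# Per-point local features (64-d) are concatenated with a max-pooled 1024-d global
+# feature (1088-d per point); a point-wise MLP then predicts one of 6 part labels per point.
+model = seg_model(num_seg_classes=6).eval()
+points = torch.rand(2, 10000, 3)       # (B, N, 3) dummy input
+with torch.no_grad():
+    scores = model(points)             # (B, N, num_seg_classes)
+print(scores.shape)                    # torch.Size([2, 10000, 6])
+```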
+
+Run:
+```bash
+python train.py --task seg
+```
+for training the segmentation model.
+
+Run:
+```bash
+python eval_seg.py
+```
+for evaluating the trained model.
+
+The best model weights are stored as **`best_model.pt`** in the **`./checkpoints/seg`** folder; the model achieves a test accuracy of **0.9022**.
+
+
+## Results
+
+### Good Predictions
+
+| Ground truth point cloud | Predicted point cloud | Accuracy |
+|:------------------:|:---------------:|:--------------------:|
+|  |  | 0.9836 |
+|  |  | 0.9237 |
+|  |  | 0.917 |
+
+### Bad Predictions
+
+| Ground truth point cloud | Predicted point cloud | Accuracy |
+|:------------------:|:---------------:|:--------------------:|
+|  |  | 0.5171 |
+|  |  | 0.4776 |
+|  |  | 0.5126 |
+
+Analysis
+-------------------------------------------------------------------------------
+
+The model struggles to accurately segment sofa-like chairs where the boundaries between components like the back, headrest, armrests, seat and legs are less defined. The blending of these parts without clear delineation poses a challenge. Similarly, chairs with highly irregular or atypical shapes and geometries also confuse the model as they deviate significantly from the distribution of point clouds seen during training.
+On the other hand, the model performs very well in segmenting chairs with distinct, well-separated components like a distinct back, seat, separable arm rests and discrete legs. Chairs that have intricate details or accessories that overlap multiple segments, like a pillow over the seat and back, trip up the model. In such cases, there is often bleeding between segments, with the model unable to constrain a larger segment from encroaching on adjacent smaller segments.
+
+
+Robustness Analysis
+===============================================================================
+
+Robustness against Rotation
+-------------------------------------------------------------------------------
+Each evaluation point cloud is rotated around the x-axis by 30, 60 and 90 degrees.
+The code loops over specific object indices and over a range of rotation angles (thetas); the rotation matrix is shown below.
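+
+For reference, the x-axis rotation applied to every point is the standard matrix (the y- and z-axis versions are analogous):
+
+$$ R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix} $$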
+
+### Classification
+Run the code:
+```bash
+python eval_cls_rotated.py
+```
+
+| Class | 0 deg (unrotated) | 30 deg | 60 deg | 90 deg |
+|:-----:|:------------:|:------:|:------:|:------:|
+| Chair |  |  |  |  |
+| Vase |  |  |  |  |
+| Lamp |  |  |  |  |
+| Test Accuracy | 0.9769 | 0.7992 | 0.2235 | 0.3012 |
+
+
+### Segmentation
+Run the code:
+```bash
+python eval_seg_rotated.py
+```
+
+| | 0 deg | 30 deg | 60 deg | 90 deg |
+|--|:------------:|:------:|:------:|:------:|
+| |  |  |  |  |
+| |  |  |  |  |
+| |  |  |  |  |
+| Test Accuracy | 0.9022 | 0.7992 | 0.399 | 0.1319 |
+
+### Analysis
+
+The model struggles to make accurate predictions when the point cloud is rotated dramatically away from an upright orientation. This limitation is likely due to the lack of data augmentation during training to include non-upright point cloud configurations. Without exposure to rotated variants of the object classes, the model fails to generalize to point clouds that deviate hugely from the expected upright positioning seen in the training data. Incorporating point cloud rotations during training data generation would likely improve the model's ability to recognize and segment objects despite major shifts in orientation. By augmenting the data to simulate tilted, skewed or even completely inverted point clouds, the model could become invariant to orientation and handle such cases gracefully during prediction.
+
+
+Robustness against Number of Points
+-------------------------------------------------------------------------------
+The models are evaluated with a varying number of sampled points.
+The code loops over specific object indices and over several values of num_points.
+
+### Classification
+Run the code:
+```bash
+python eval_cls_numpoints.py
+```
+
+| Class | 10 | 100 | 1000 | 10000 |
+|:-----:|:------------:|:------:|:------:|:------:|
+| Chair |  |  |  |  |
+| Vase |  |  |  |  |
+| Lamp |  |  |  |  |
+| Test Accuracy | 0.5012 | 0.8255 | 0.8992 | 0.9769 |
+
+
+### Segmentation
+Run the code:
+```bash
+python eval_seg_numpoints.py
+```
+
+| | 10 | 100 | 1000 | 10000 |
+|--|:------------:|:------:|:------:|:------:|
+| |  |  |  |  |
+| |  |  |  |  |
+| |  |  |  |  |
+| Test Accuracy | 0.4673 | 0.7992 | 0.8599 | 0.9022 |
+
+### Analysis
+
+The model demonstrates considerable robustness to sparsity in the point cloud inputs. With as few as 10 points, it can achieve roughly 50% test accuracy, rising rapidly to around 80% accuracy with only 50 points. This suggests the model is able to infer the correct shape from even a very sparse sampling of points. However, its ability to generalize from such limited information may be challenged as the number of classes increases. Discriminating between more categories with fewer representative points could lower the accuracy, despite the architectural design choices to promote invariance to input sparsity. There may be a threshold minimum density below which the lack of key shape features impedes reliable classification, especially with additional object classes added. But with this 3-class dataset, the model's performance from merely 50 input points indicates surprising generalizability from scant point data.
+
+
+
+
diff --git a/train.py b/train.py
new file mode 100644
index 0000000..40710f8
--- /dev/null
+++ b/train.py
@@ -0,0 +1,188 @@
+import numpy as np
+import argparse
+import torch
+import torch.optim as optim
+from torch.utils.tensorboard import SummaryWriter
+
+from models import cls_model, seg_model
+from data_loader import get_data_loader
+from utils import save_checkpoint, create_dir
+
+def train(train_dataloader, model, opt, epoch, args, writer):
+
+ model.train()
+ model.to(args.device)
+ step = epoch*len(train_dataloader)
+ epoch_loss = 0
+
+ for i, batch in enumerate(train_dataloader):
+ point_clouds, labels = batch
+ point_clouds = point_clouds.to(args.device)
+ labels = labels.to(args.device).to(torch.long)
+
+ # ------ TO DO: Forward Pass ------
+ predictions = model(point_clouds)
+
+ if (args.task == "seg"):
+ labels = labels.reshape([-1])
+ predictions = predictions.reshape([-1, args.num_seg_class])
+
+ # Compute Loss
+ criterion = torch.nn.CrossEntropyLoss()
+ loss = criterion(predictions, labels)
+        epoch_loss += loss.item()  # accumulate as a float so the computation graph is not retained
+
+ # Backward and Optimize
+ opt.zero_grad()
+ loss.backward()
+ opt.step()
+
+ writer.add_scalar('train_loss', loss.item(), step+i)
+
+ return epoch_loss
+
+def test(test_dataloader, model, epoch, args, writer):
+
+ model.eval()
+
+ # Evaluation in Classification Task
+ if (args.task == "cls"):
+ correct_obj = 0
+ num_obj = 0
+ for batch in test_dataloader:
+ point_clouds, labels = batch
+ point_clouds = point_clouds.to(args.device)
+ labels = labels.to(args.device).to(torch.long)
+
+ # ------ TO DO: Make Predictions ------
+ with torch.no_grad():
+ pred_labels = torch.argmax(model(point_clouds), dim=-1, keepdim=False)
+ correct_obj += pred_labels.eq(labels.data).cpu().sum().item()
+ num_obj += labels.size()[0]
+
+ # Compute Accuracy of Test Dataset
+ accuracy = correct_obj / num_obj
+
+
+ # Evaluation in Segmentation Task
+ else:
+ correct_point = 0
+ num_point = 0
+ for batch in test_dataloader:
+ point_clouds, labels = batch
+ point_clouds = point_clouds.to(args.device)
+ labels = labels.to(args.device).to(torch.long)
+
+ # ------ TO DO: Make Predictions ------
+ with torch.no_grad():
+ pred_labels = torch.argmax(model(point_clouds), dim=-1, keepdim=False)
+
+ correct_point += pred_labels.eq(labels.data).cpu().sum().item()
+ num_point += labels.view([-1,1]).size()[0]
+
+ # Compute Accuracy of Test Dataset
+ accuracy = correct_point / num_point
+
+ writer.add_scalar("test_acc", accuracy, epoch)
+ return accuracy
+
+
+def main(args):
+ """Loads the data, creates checkpoint and sample directories, and starts the training loop.
+ """
+
+ # Create Directories
+ create_dir(args.checkpoint_dir)
+ create_dir('./logs')
+
+ # Tensorboard Logger
+ writer = SummaryWriter('./logs/{0}'.format(args.task+"_"+args.exp_name))
+
+ # ------ TO DO: Initialize Model ------
+ if args.task == "cls":
+ model = cls_model()
+ else:
+ model = seg_model()
+
+ # Load Checkpoint
+ if args.load_checkpoint:
+ model_path = "{}/{}.pt".format(args.checkpoint_dir,args.load_checkpoint)
+ with open(model_path, 'rb') as f:
+ state_dict = torch.load(f, map_location=args.device)
+ model.load_state_dict(state_dict)
+ print ("successfully loaded checkpoint from {}".format(model_path))
+
+ # Optimizer
+ opt = optim.Adam(model.parameters(), args.lr, betas=(0.9, 0.999))
+
+ # Dataloader for Training & Testing
+ train_dataloader = get_data_loader(args=args, train=True)
+ test_dataloader = get_data_loader(args=args, train=False)
+
+ print ("successfully loaded data")
+
+ best_acc = -1
+
+ print ("======== start training for {} task ========".format(args.task))
+ print ("(check tensorboard for plots of experiment logs/{})".format(args.task+"_"+args.exp_name))
+
+ for epoch in range(args.num_epochs):
+
+ # Train
+ train_epoch_loss = train(train_dataloader, model, opt, epoch, args, writer)
+
+ # Test
+ current_acc = test(test_dataloader, model, epoch, args, writer)
+
+ print ("epoch: {} train loss: {:.4f} test accuracy: {:.4f}".format(epoch, train_epoch_loss, current_acc))
+
+ # Save Model Checkpoint Regularly
+ if epoch % args.checkpoint_every == 0:
+ print ("checkpoint saved at epoch {}".format(epoch))
+ save_checkpoint(epoch=epoch, model=model, args=args, best=False)
+
+ # Save Best Model Checkpoint
+ if (current_acc >= best_acc):
+ best_acc = current_acc
+ print ("best model saved at epoch {}".format(epoch))
+ save_checkpoint(epoch=epoch, model=model, args=args, best=True)
+
+ print ("======== training completes ========")
+
+
+def create_parser():
+ """Creates a parser for command-line arguments.
+ """
+ parser = argparse.ArgumentParser()
+
+ # Model & Data hyper-parameters
+ parser.add_argument('--task', type=str, default="cls", help='The task: cls or seg')
+ parser.add_argument('--num_seg_class', type=int, default=6, help='The number of segmentation classes')
+
+ # Training hyper-parameters
+ parser.add_argument('--num_epochs', type=int, default=150)
+ parser.add_argument('--batch_size', type=int, default=16, help='The number of images in a batch.')
+ parser.add_argument('--num_workers', type=int, default=0, help='The number of threads to use for the DataLoader.')
+ parser.add_argument('--lr', type=float, default=0.001, help='The learning rate (default 0.001)')
+
+ parser.add_argument('--exp_name', type=str, default="exp", help='The name of the experiment')
+
+ # Directories and checkpoint/sample iterations
+ parser.add_argument('--main_dir', type=str, default='./data/')
+ parser.add_argument('--checkpoint_dir', type=str, default='./checkpoints')
+ parser.add_argument('--checkpoint_every', type=int , default=10)
+
+ parser.add_argument('--load_checkpoint', type=str, default='')
+
+
+ return parser
+
+
+if __name__ == '__main__':
+ parser = create_parser()
+ args = parser.parse_args()
+ args.device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
+ args.checkpoint_dir = args.checkpoint_dir+"/"+args.task # checkpoint directory is task specific
+
+ main(args)
\ No newline at end of file
diff --git a/utils.py b/utils.py
new file mode 100644
index 0000000..4172ea4
--- /dev/null
+++ b/utils.py
@@ -0,0 +1,111 @@
+import os
+import torch
+import pytorch3d
+from pytorch3d.renderer import (
+ AlphaCompositor,
+ PointsRasterizationSettings,
+ PointsRenderer,
+ PointsRasterizer,
+)
+import imageio
+import numpy as np
+
+def save_checkpoint(epoch, model, args, best=False):
+ if best:
+ path = os.path.join(args.checkpoint_dir, 'best_model.pt')
+ else:
+ path = os.path.join(args.checkpoint_dir, 'model_epoch_{}.pt'.format(epoch))
+ torch.save(model.state_dict(), path)
+
+def create_dir(directory):
+ """
+ Creates a directory if it does not already exist.
+ """
+ if not os.path.exists(directory):
+ os.makedirs(directory)
+
+def get_points_renderer(
+ image_size=256, device=None, radius=0.01, background_color=(1, 1, 1)
+):
+ """
+ Returns a Pytorch3D renderer for point clouds.
+
+ Args:
+ image_size (int): The rendered image size.
+ device (torch.device): The torch device to use (CPU or GPU). If not specified,
+ will automatically use GPU if available, otherwise CPU.
+ radius (float): The radius of the rendered point in NDC.
+ background_color (tuple): The background color of the rendered image.
+
+ Returns:
+ PointsRenderer.
+ """
+ if device is None:
+ if torch.cuda.is_available():
+ device = torch.device("cuda:0")
+ else:
+ device = torch.device("cpu")
+ raster_settings = PointsRasterizationSettings(image_size=image_size, radius=radius,)
+ renderer = PointsRenderer(
+ rasterizer=PointsRasterizer(raster_settings=raster_settings),
+ compositor=AlphaCompositor(background_color=background_color),
+ )
+ return renderer
+
+def viz_cls (verts, path, device):
+ """
+ visualize classification result
+ output: a 360-degree gif
+ """
+ image_size=256
+ background_color=(1, 1, 1)
+
+ # Construct various camera viewpoints
+ dist = 3
+ elev = 0
+ azim = [180 - 12*i for i in range(30)]
+ R, T = pytorch3d.renderer.cameras.look_at_view_transform(dist=dist, elev=elev, azim=azim, device=device)
+ c = pytorch3d.renderer.FoVPerspectiveCameras(R=R, T=T, fov=60, device=device)
+
+ sample_verts = verts.repeat(30,1,1).to(torch.float)
+ sample_colors = torch.tensor([0.7,0.7,1.0]).repeat(1,sample_verts.shape[1],1).repeat(30,1,1).to(torch.float)
+
+ point_cloud = pytorch3d.structures.Pointclouds(points=sample_verts, features=sample_colors).to(device)
+
+ renderer = get_points_renderer(image_size=image_size, background_color=background_color, device=device)
+ rend = renderer(point_cloud, cameras=c).cpu().numpy() # (30, 256, 256, 3)
+ rend = (np.clip(rend, 0, 1) * 255).astype(np.uint8)
+
+ imageio.mimsave(path, rend, fps=15, loop = 0)
+
+def viz_seg(verts, labels, path, device):
+ """
+ Visualize segmentation result as a 360-degree gif.
+ """
+ image_size = 256
+ background_color = (1, 1, 1)
+ colors = [[1.0, 1.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
+
+ # Construct various camera viewpoints
+ dist = 3
+ elev = 0
+ azim = [180 - 12 * i for i in range(30)]
+ R, T = pytorch3d.renderer.cameras.look_at_view_transform(dist=dist, elev=elev, azim=azim, device=device)
+ c = pytorch3d.renderer.FoVPerspectiveCameras(R=R, T=T, fov=60, device=device)
+
+ sample_verts = verts.unsqueeze(0).repeat(30, 1, 1).to(torch.float)
+ sample_labels = labels.unsqueeze(0).repeat(30, 1) # Repeat labels for each viewpoint
+ sample_colors = torch.zeros_like(sample_verts) # Use the same shape as sample_verts
+
+ # Colorize points based on segmentation labels
+ for i in range(6):
+ sample_colors[sample_labels == i] = torch.tensor(colors[i])
+
+ point_cloud = pytorch3d.structures.Pointclouds(points=sample_verts, features=sample_colors).to(device)
+
+ renderer = get_points_renderer(image_size=image_size, background_color=background_color, device=device)
+ rend = renderer(point_cloud, cameras=c).cpu().numpy() # (30, 256, 256, 3)
+ rend = (np.clip(rend, 0, 1) * 255).astype(np.uint8)
+
+ imageio.mimsave(path, rend, fps=15, loop=0)
+