PF-Net 3D Point-Cloud Completion
A different data modality: unordered 3D point sets. GAN-based completion with PF-Net (Point Fractal Network): on ShapeNet-Part, a 512-point crop serves as self-supervised ground truth, and a multi-scale FPS encoder (1920-d latent) plus a residual pyramid decoder fills the hole in three stages, center1 (64) → center2 (128) → fine (512), constrained by Chamfer Distance and an adversarial loss.
Full cloud → crop the 512 points nearest a chosen viewpoint (a hole appears) → multi-scale FPS pyramid → the fractal decoder fills the hole center1 (64) → center2 (128) → fine (512), with a live Chamfer Distance readout and an adversarial-loss toggle.
Why this local version exists
The crop strategy (5 viewpoints + distance-sort crop 512), point_scales_list, 1920-d latent, residual pyramid decoder, and errG/errG_l2 loss are PF-Net's real structure from the course's 3D point-cloud source. The course is HLS video with no subtitles, so architecture is reconstructed from code + titles (high confidence); weights are not shipped, the cloud/coords/CD values are illustrative, and no model runs in the browser.
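The crop step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's code: the five viewpoint coordinates, the function name crop_nearest, and the random sphere used as a stand-in cloud are all assumptions.

```python
import numpy as np

# Five fixed candidate viewpoints. These coordinates are an assumption
# chosen for illustration only.
VIEWPOINTS = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [-1.0, 0.0, 0.0],
    [-1.0, 1.0, 0.0],
])

def crop_nearest(cloud, viewpoint, crop_size=512):
    """Split cloud (N, 3) into (partial, missing) by distance to viewpoint."""
    dist = np.linalg.norm(cloud - viewpoint, axis=1)
    order = np.argsort(dist)             # nearest points first
    missing = cloud[order[:crop_size]]   # becomes the ground truth for the hole
    partial = cloud[order[crop_size:]]   # what the encoder gets to see
    return partial, missing

rng = np.random.default_rng(0)
cloud = rng.normal(size=(2048, 3))
cloud /= np.linalg.norm(cloud, axis=1, keepdims=True)  # stand-in shape: unit sphere
view = VIEWPOINTS[rng.integers(len(VIEWPOINTS))]
partial, missing = crop_nearest(cloud, view)
print(partial.shape, missing.shape)
```

Because the ground truth is cut out of the input itself, no extra labels are needed, which is what makes the setup self-supervised.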
PF-Net point-cloud completion: crop a region, regenerate it
Full cloud → crop the 512 points nearest a viewpoint to leave a hole → multi-scale FPS encoder → a fractal pyramid decoder fills the hole center1 (64) → center2 (128) → fine (512), ending in a Chamfer Distance readout.
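The multi-scale FPS stage can be sketched as greedy farthest point sampling applied at several target sizes. The scale sizes below (1024, 512, 256) and the function name are hypothetical; the actual values in point_scales_list are not given here.

```python
import numpy as np

def farthest_point_sample(points, k, start=0):
    """Return indices of k points picked by greedy farthest point sampling."""
    n = points.shape[0]
    chosen = np.empty(k, dtype=np.int64)
    min_dist = np.full(n, np.inf)  # squared distance to the nearest chosen point
    idx = start
    for i in range(k):
        chosen[i] = idx
        d = np.sum((points - points[idx]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, d)
        idx = int(np.argmax(min_dist))  # next pick: farthest from everything chosen
    return chosen

rng = np.random.default_rng(0)
partial = rng.uniform(-1, 1, size=(1536, 3))  # stand-in partial cloud
scales = [1024, 512, 256]                     # assumed scale sizes
pyramid = [partial[farthest_point_sample(partial, k)] for k in scales]
print([p.shape for p in pyramid])
```

Unlike random subsampling, FPS spreads the kept points evenly over the surface, so each coarser scale still outlines the whole shape.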
2D-projected point cloud
* Grey = partial input; green = decoder output; amber ring = crop viewpoint. Shape + coords are illustrative; weights not shipped.
fractal pyramid decoder
Chamfer Distance ×100 (fine ↔ GT)
errG_l2 = CD(fine, gt) + α₁·CD(center1, key1) + α₂·CD(center2, key2), where key1 and key2 are the ground truth downsampled to 64 and 128 points. Lower CD is better; it drops coarse→fine.
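As a rough illustration of the readout, here is a symmetric Chamfer Distance and the three-term errG_l2 sum in NumPy. The normalization (mean of squared nearest-neighbour distances, averaged over both directions), the α values, and the use of simple slices as key1/key2 are assumptions made for this sketch.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    a_to_b = d2.min(axis=1).mean()  # each point in a to its nearest point in b
    b_to_a = d2.min(axis=0).mean()  # each point in b to its nearest point in a
    return 0.5 * (a_to_b + b_to_a)

def err_g_l2(fine, gt, center1, key1, center2, key2, alpha1, alpha2):
    """Multi-stage completion loss: fine term plus weighted coarse terms."""
    return (chamfer_distance(fine, gt)
            + alpha1 * chamfer_distance(center1, key1)
            + alpha2 * chamfer_distance(center2, key2))

rng = np.random.default_rng(0)
gt = rng.uniform(-1, 1, size=(512, 3))
key1, key2 = gt[:64], gt[:128]  # stand-ins for downsampled ground truth

def jitter(x, scale):
    """Fake a decoder output by perturbing the target."""
    return x + rng.normal(scale=scale, size=x.shape)

loss = err_g_l2(jitter(gt, 0.01), gt, jitter(key1, 0.01), key1,
                jitter(key2, 0.01), key2, alpha1=0.1, alpha2=0.2)
print(100 * loss)  # the demo reports CD scaled by 100
```

Chamfer Distance needs no point-to-point correspondence, which is why it suits unordered sets: each point is simply matched to its nearest neighbour in the other set.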
adversarial loss (D_choose)
errG = (1−wtl2)·BCE_adv + wtl2·errG_l2, with wtl2 = 0.95
The _netlocalD discriminator pushes the filled region toward a realistic surface, penalizing over-smooth solutions.
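The weighting above can be written out as a small function. This sketch assumes BCE_adv is the binary cross-entropy of the discriminator's scores on the generated patch against the "real" label, and that switching D_choose off leaves only the reconstruction term; the function and argument names are hypothetical.

```python
import numpy as np

def bce_against_real(d_fake, eps=1e-7):
    """BCE of discriminator outputs on generated points against the label 1."""
    p = np.clip(d_fake, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.log(p).mean())

def generator_loss(d_fake, err_g_l2, wtl2=0.95, use_adversarial=True):
    """Blend the adversarial and reconstruction terms with weight wtl2."""
    if not use_adversarial:
        return err_g_l2  # assumed behaviour with the toggle off
    return (1.0 - wtl2) * bce_against_real(d_fake) + wtl2 * err_g_l2

d_fake = np.array([0.5, 0.5])  # discriminator is undecided about the fake patch
print(generator_loss(d_fake, err_g_l2=2.0))
print(generator_loss(d_fake, err_g_l2=2.0, use_adversarial=False))
```

With wtl2 = 0.95 the adversarial term contributes only 5% of the generator loss, so it acts as a regularizer on surface realism rather than the main training signal.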
What to try
Run it and watch the crop remove the 512 points nearest the viewpoint, leaving a hole in the partial cloud.
Watch the decoder fill the hole coarse → fine (center1 64 → center2 128 → fine 512), with Chamfer Distance dropping each level.
Toggle the adversarial loss (D_choose) on/off to see reconstruction-only vs GAN-assisted.
What this demo proves
You're comfortable with unordered 3D set data — FPS, Chamfer Distance, permutation-invariant max-pooling.
You can do encoder-decoder + GAN beyond 2D: structured generation, not classification/detection.
You understand multi-scale residual design — why a coarse→fine pyramid beats regressing 512 points directly.
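One way to picture the residual pyramid: each stage predicts small offsets around the points of the previous, coarser stage instead of regressing absolute coordinates. The sketch below assumes 2 children per center1 point and 4 per center2 point (which matches 64 → 128 → 512); random offsets stand in for the decoder's outputs, and the function name expand is hypothetical.

```python
import numpy as np

def expand(parent, offsets):
    """Add per-child offsets to each parent point.

    parent:  (P, 3) coarse points
    offsets: (P, C, 3) predicted displacements, C children per parent
    returns: (P * C, 3) finer points, children of one parent kept adjacent
    """
    return (parent[:, None, :] + offsets).reshape(-1, 3)

rng = np.random.default_rng(0)
center1 = rng.uniform(-1, 1, size=(64, 3))       # coarse skeleton of the hole
off2 = rng.normal(scale=0.05, size=(64, 2, 3))   # stand-in for decoder output
center2 = expand(center1, off2)                  # (128, 3)
off3 = rng.normal(scale=0.02, size=(128, 4, 3))  # stand-in for decoder output
fine = expand(center2, off3)                     # (512, 3)
print(center1.shape, center2.shape, fine.shape)
```

Because every fine point is anchored to a coarse parent, the network only has to learn local detail at each level, which is generally easier than placing all 512 points from scratch.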
Task
Point-cloud completion (GAN) · ShapeNet-Part 16 classes · npoints 2048
Architecture
Multi-scale FPS encoder (1920-d) → residual pyramid decoder 64→128→512
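A hedged sketch of how a 1920-d, permutation-invariant feature can arise for a single scale: a shared per-point MLP, with the last four layers max-pooled over the point axis and concatenated (128 + 256 + 512 + 1024 = 1920). The layer widths and random weights are assumptions for illustration, and the fusion of the three scales into one latent is omitted.

```python
import numpy as np

WIDTHS = [64, 64, 128, 256, 512, 1024]  # assumed per-point layer widths

def make_weights(rng):
    """Random stand-in weights for the shared per-point MLP."""
    dims = [3] + WIDTHS
    return [rng.normal(scale=dims[i] ** -0.5, size=(dims[i], dims[i + 1]))
            for i in range(len(WIDTHS))]

def encode(points, weights):
    """Shared per-point MLP; max-pool the last four layers and concatenate."""
    x, pooled = points, []
    for w in weights:
        x = np.maximum(x @ w, 0.0)    # same weights applied to every point
        pooled.append(x.max(axis=0))  # max over points ignores their order
    return np.concatenate(pooled[-4:])

rng = np.random.default_rng(0)
weights = make_weights(rng)
cloud = rng.uniform(-1, 1, size=(512, 3))
latent = encode(cloud, weights)
shuffled = cloud[rng.permutation(len(cloud))]
print(latent.shape, np.allclose(latent, encode(shuffled, weights)))
```

Shuffling the input rows leaves the latent unchanged, which is the property that lets the encoder treat a point cloud as a set rather than a sequence.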
Loss
Chamfer Distance ×100 + adversarial · errG_l2 weight wtl2=0.95