Add ViT models, HSI Hang2020, inference API, and taxonomic level classification #13
Open
Ritesh313 wants to merge 5 commits into GatorSense:main from
Major addition:
- Complete inference API for loading models and making predictions
- Support for species-level (167 classes) and genus-level (60 classes) classification
- TreeClassifier class with from_checkpoint() and predict() methods
- Label mapping system with JSON metadata files
- Image preprocessing pipeline for various input formats
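As a rough illustration of the label mapping idea, the sketch below writes an index-to-label mapping to a JSON file and reads it back. The JSON layout, the `idx_to_label` key, and the helper names `save_label_mapping` and `load_label_mapping` are assumptions made for this example; the PR's actual metadata format may differ.

```python
import json
from pathlib import Path


def save_label_mapping(labels, path):
    """Write a sorted index -> label mapping to JSON (hypothetical layout)."""
    # JSON object keys must be strings, so indices are stored as "0", "1", ...
    idx_to_label = {str(i): name for i, name in enumerate(sorted(set(labels)))}
    Path(path).write_text(json.dumps({"idx_to_label": idx_to_label}, indent=2))
    return idx_to_label


def load_label_mapping(path):
    """Read the mapping back, converting the string keys to int indices."""
    raw = json.loads(Path(path).read_text())
    return {int(i): name for i, name in raw["idx_to_label"].items()}
```

Sorting the labels before indexing keeps the class indices stable across runs, which matters when a checkpoint and its mapping file are produced at different times.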
Core enhancements:
- DataModule now supports taxonomic_level parameter ('species' or 'genus')
- Genus extraction via species_name.split()[0] for 60-class classification
- WeightedRandomSampler support for class balancing
- External test set with species overlap filtering
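The genus extraction and class balancing described above can be sketched in plain Python. The `split()[0]` rule comes from the commit message; the function names and the inverse-frequency weighting scheme are assumptions for illustration and may not match what the DataModule does.

```python
from collections import Counter


def species_to_genus(species_name):
    """Take the first whitespace-separated token as the genus."""
    return species_name.split()[0]


def build_genus_labels(species_names):
    """Map each sample to a genus index; return (labels, genus_to_idx)."""
    genera = sorted({species_to_genus(s) for s in species_names})
    genus_to_idx = {g: i for i, g in enumerate(genera)}
    labels = [genus_to_idx[species_to_genus(s)] for s in species_names]
    return labels, genus_to_idx


def inverse_frequency_weights(labels):
    """Per-sample weights so that each class is drawn about equally often."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]
```

Weights of this kind are the sort of input that `torch.utils.data.WeightedRandomSampler(weights, num_samples, replacement=True)` accepts. Note that `split()[0]` assumes every species name starts with its genus, which is what the label inspection script is meant to verify.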
Documentation:
- Comprehensive docs/taxonomic_levels.md guide (314 lines)
- Label inspection script for validation
- Test scripts for inference verification
- Examples of progressive training (genus → species)
Files added:
- neon_tree_classification/inference/ (complete module)
- docs/taxonomic_levels.md
- scripts/create_label_mappings.py
- scripts/test_inference.py
- processing/misc/inspect_labels.py
Modified:
- neon_tree_classification/core/datamodule.py (+163 lines)
- neon_tree_classification/core/dataset.py (+77 lines)
- examples/train.py (+21 lines)
- docs/training.md (+18 lines)
This enables:
1. Quick model deployment with TreeClassifier.from_checkpoint()
2. Flexible training at species or genus level
3. Production-ready inference with batch prediction
4. Label mapping files for HuggingFace upload
Breaking changes: None (backward compatible)
…ation
BREAKING CHANGES:
- Default RGB image size changed from 128x128 to 224x224
- Default RGB normalization changed from 0_1 to imagenet
For backward compatibility, explicitly pass rgb_size=(128, 128) and rgb_norm_method='0_1'
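To make the difference between the two normalization modes concrete, here is a minimal per-pixel sketch. The mean and standard deviation are the standard ImageNet statistics; the function name `normalize_pixel` and the assumption that inputs are 8-bit values scaled to [0, 1] before the ImageNet statistics are applied are illustrative and not taken from the PR.

```python
# Standard ImageNet channel statistics (for inputs scaled to [0, 1]).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, method="imagenet"):
    """Normalize one 8-bit RGB pixel with the '0_1' or 'imagenet' method."""
    scaled = [c / 255.0 for c in rgb]  # both modes start from [0, 1]
    if method == "0_1":
        return scaled
    if method == "imagenet":
        return [(c - m) / s for c, m, s in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]
    raise ValueError(f"unknown normalization method: {method!r}")
```

A checkpoint trained with one mode will generally see out-of-distribution inputs if fed images normalized with the other, which is why the old defaults need to be passed explicitly for older models.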
New Features:
- Add Vision Transformer (ViT) support: vit_b_16, vit_b_32, vit_l_16, vit_l_32
- Implement Hang2020 dual-pathway attention architecture for HSI classification
- Add model_variant parameter to training script for architecture selection
- Add preliminary DeepForest CropModel compatibility methods (WIP):
* normalize() method for transforms
* label_dict persistence in checkpoints
* set_label_dict() and get_label_dict() helpers
- Add HuggingFace upload script (experimental, needs further testing)
- Add multi-output training support with auxiliary losses (Hang2020)
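Multi-output training with auxiliary losses usually means adding weighted losses from extra heads to the main loss. The sketch below shows that pattern in plain Python for a single sample. The convention that the first output is the main head, the `aux_weight` value, and the function names are assumptions; the weighting used for Hang2020 in this PR is not stated in the commit message.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample from raw logits (stable log-sum-exp)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multi_output_loss(outputs, target, aux_weight=0.5):
    """Main-head loss plus down-weighted losses from auxiliary heads.

    outputs[0] is treated as the main head (an assumption of this sketch);
    every remaining entry is an auxiliary head.
    """
    main, *aux = outputs
    loss = cross_entropy(main, target)
    for aux_logits in aux:
        loss += aux_weight * cross_entropy(aux_logits, target)
    return loss
```

In a PyTorch training step the same structure would apply to batched tensors, with only the main head used for predictions at test time.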
Improvements:
- Better experiment naming to prevent collisions in SLURM array jobs
- Enhanced test logging with detailed statistics
- Add rgb_size and rgb_norm_method CLI arguments for flexibility
- Update README with project roadmap
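One common way to avoid experiment-name collisions in SLURM array jobs is to include the job and task IDs that SLURM exports as environment variables. The following is a sketch of that idea only; the function name, the name layout, and the timestamp format are assumptions, not the scheme implemented in `train.py`.

```python
import os
from datetime import datetime


def make_experiment_name(base, env=None, now=None):
    """Build a name that stays unique across tasks of a SLURM array job."""
    env = os.environ if env is None else env
    now = datetime.now() if now is None else now
    parts = [base, now.strftime("%Y%m%d-%H%M%S")]
    # Array jobs share SLURM_ARRAY_JOB_ID; each task has its own task ID.
    job_id = env.get("SLURM_ARRAY_JOB_ID") or env.get("SLURM_JOB_ID")
    task_id = env.get("SLURM_ARRAY_TASK_ID")
    if job_id:
        parts.append(f"job{job_id}")
    if task_id:
        parts.append(f"task{task_id}")
    return "_".join(parts)
```

A timestamp alone is not enough here, because array tasks often start within the same second; the task ID is what separates them.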
Note: Full DeepForest CropModel integration and HuggingFace loading
are still in progress and may require additional work.
Files changed: 10 files
- Added: scripts/upload_to_huggingface.py, sample_plots/test_PSMEM_douglas_fir.png
- Modified: train.py, rgb_models.py, hsi_models.py, lightning_modules.py,
dataset.py, datamodule.py, README.md, visualization.ipynb
Pull request overview
This PR expands NeonTreeClassification with additional model architectures (ViT for RGB and Hang2020 for HSI), adds an inference API (TreeClassifier) with taxonomic-level (species/genus) support, and updates training/docs/scripts to support the new workflows.
Changes:
- Added ViT RGB model variants and Hang2020 HSI architecture with multi-output (aux loss) training support.
- Introduced neon_tree_classification.inference module (preprocessing, label mappings, registry, predictor API) plus scripts for label mapping creation and HF uploads.
- Added genus-level training support in the DataModule (label mapping + optional balanced sampling) and updated docs/examples accordingly.
Reviewed changes
Copilot reviewed 21 out of 23 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| scripts/upload_to_huggingface.py | New utility to convert/upload checkpoints to HF; writes config/model card and safetensors. |
| scripts/test_inference.py | New manual inference “test” script for running predictions from HDF5 samples. |
| scripts/create_label_mappings.py | New script to generate species/genus label mapping JSONs for inference. |
| sample_plots/test_PSMEM_douglas_fir.png | Adds an example plot asset. |
| processing/misc/inspect_labels.py | Adds a label/genus inspection script to validate genus extraction assumptions. |
| notebooks/visualization.ipynb | Updates notebook outputs/cell execution metadata and sample CSV path. |
| neon_tree_classification/models/rgb_models.py | Adds ViT-based RGB classifier and extends RGB model factory. |
| neon_tree_classification/models/lightning_modules.py | Adds DeepForest compatibility helpers; adds multi-output loss support for HSI classifier. |
| neon_tree_classification/models/hsi_models.py | Implements Hang2020 dual-pathway attention architecture and adds it to the factory. |
| neon_tree_classification/inference/utils.py | Adds utilities for label loading, prediction formatting, checkpoint metadata extraction, etc. |
| neon_tree_classification/inference/preprocessing.py | Adds image loading/resizing/normalization/tensor preparation pipeline for inference. |
| neon_tree_classification/inference/predictor.py | Adds TreeClassifier high-level inference API. |
| neon_tree_classification/inference/model_registry.py | Adds a local registry for known pretrained model metadata and label mapping resolution. |
| neon_tree_classification/inference/label_mappings/species_labels.json | Adds committed species label mapping used by inference. |
| neon_tree_classification/inference/label_mappings/genus_labels.json | Adds committed genus label mapping used by inference. |
| `neon_tree_classification/inference/__init__.py` | Exposes inference API/public functions via package imports. |
| neon_tree_classification/core/dataset.py | Updates default RGB preprocessing (224 + ImageNet) and adds genus mapping support in label validation/lookup. |
| neon_tree_classification/core/datamodule.py | Adds taxonomic_level and optional WeightedRandomSampler balanced sampling; genus label mapping creation. |
| examples/train.py | Adds CLI args for model variant, taxonomic level, RGB size/norm; passes idx_to_label for checkpoint metadata. |
| docs/training.md | Updates baseline result table and notes for new configs. |
| docs/taxonomic_levels.md | New guide for genus/species training and filtering guidance. |
| README.md | Updates project status lines and notes. |
| .gitignore | Adds an extra ignored markdown file entry. |
- Remove sys.path manipulation from predictor.py (use package imports directly)
- Remove unused OrderedDict import in create_label_mappings.py
- Update preprocessing defaults to 224x224 and imagenet normalization
- Update preprocess_image_batch and resize_image defaults to match
- Fix normalize_rgb docstring to accurately describe both normalization modes
- Update model_registry input_size to 224x224 and add norm_method field
- Make TreeClassifier norm_method configurable (default: imagenet)
- Fix predictor to use self.norm_method instead of hardcoded '0_1'
- Update from_checkpoint to use (224, 224) and imagenet defaults
- Add rgb_norm_method param to RGBClassifier; normalize() now reflects it
- Validate numeric_to_label_dict in upload_to_huggingface.py
- Fix docs: species_filter is inclusion filter, not exclusion
- Fix docs: add --csv_path to inspect_labels.py example commands
- Fix warning message: clarify species_filter is an inclusion filter
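Two of the fixes above are easy to get wrong, so a small sketch may help: `species_filter` keeps only the listed species (inclusion), and a numeric-to-label dictionary should cover every class index exactly once. The row layout, the `species` key, and both function names are hypothetical; they are not the signatures used in the repository.

```python
def apply_species_filter(rows, species_filter=None):
    """Keep only rows whose 'species' value is listed in species_filter.

    An empty or missing filter keeps everything; the filter never excludes
    the listed species.
    """
    if not species_filter:
        return list(rows)
    allowed = set(species_filter)
    return [row for row in rows if row["species"] in allowed]


def validate_numeric_to_label(mapping, num_classes):
    """Raise ValueError unless keys are exactly 0..num_classes-1."""
    keys = {int(k) for k in mapping}  # tolerate JSON-style string keys
    expected = set(range(num_classes))
    if keys != expected:
        missing = sorted(expected - keys)
        extra = sorted(keys - expected)
        raise ValueError(f"label dict mismatch: missing={missing}, extra={extra}")
    return {int(k): v for k, v in mapping.items()}
```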
Pull request overview
Copilot reviewed 23 out of 25 changed files in this pull request and generated no new comments.
aehampton99 approved these changes on Feb 18, 2026
Summary
This PR adds two major feature sets to the NeonTreeClassification pipeline: expanded model architectures and a complete inference API with taxonomic level support.
Commit 1: New model architectures & DeepForest compatibility groundwork
Breaking changes:
- Default RGB image size: 128x128 → 224x224
- Default RGB normalization: `0_1` → `imagenet`
- For backward compatibility, pass `rgb_size=(128, 128)` and `rgb_norm_method='0_1'` explicitly
New features:
- Vision Transformer (ViT) support: `vit_b_16`, `vit_b_32`, `vit_l_16`, `vit_l_32`
- `model_variant` CLI parameter for architecture selection
- Preliminary DeepForest `CropModel` compatibility methods (`normalize()`, `label_dict` persistence, `set_label_dict()`, `get_label_dict()`), still WIP
Commit 2: Inference API & taxonomic level classification
New features:
- `neon_tree_classification.inference` module with `TreeClassifier` class
- `TreeClassifier.from_checkpoint()` for easy model loading
- `predict()` with batch support and image preprocessing pipeline
- `taxonomic_level` parameter in `DataModule` (`'species'` or `'genus'`)
- `WeightedRandomSampler` for class balancing
Documentation:
- docs/taxonomic_levels.md guide for species- and genus-level training
No breaking changes in this commit.
Testing
- `TreeClassifier` loading and prediction