[deep learning] Pet segmentation with ViT-Adapter


Swapping out the segmentation model used in banmo...

Let's analyze the existing model

The task is to run segmentation on the animal pictured above.

See cfg.MODEL_WEIGHTS? That is where the hint is.

banmo originally ran instance segmentation with an R-CNN X101 model.

Pet photo after inference with pointrend_rcnn_X_101_32x8d_FPN_3x_coco

My guess is that it uses a fairly recent pretrained model from detectron2... let's see which model it actually is.

So what is detectron2?

https://github.com/facebookresearch/detectron2


It is a framework for object detection and instance segmentation.

The model used in the config above is https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_X_101_32x8d_FPN_3x_coco/28119989/model_final_ba17b9.pkl

Its mask AP is listed as 44.7.
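The hint in the weights URL can be read off mechanically. Here is a minimal sketch, assuming the URL path follows the layout `detectron2/<project>/<task>/<config name>/<model id>/<file>`. The helper name `describe_weights_url` is hypothetical and is not part of banmo or detectron2.

```python
from urllib.parse import urlparse

# Weights URL quoted above (the value found in the banmo config).
WEIGHTS_URL = (
    "https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/"
    "pointrend_rcnn_X_101_32x8d_FPN_3x_coco/28119989/model_final_ba17b9.pkl"
)


def describe_weights_url(url):
    """Split a detectron2-style weights URL into named path components.

    Hypothetical helper: it assumes the path has exactly six segments,
    detectron2/<project>/<task>/<config>/<model id>/<file>.
    """
    parts = urlparse(url).path.strip("/").split("/")
    _, project, task, config, model_id, filename = parts
    return {
        "project": project,
        "task": task,
        "config": config,
        "model_id": model_id,
        "file": filename,
    }


info = describe_weights_url(WEIGHTS_URL)
print(info["project"], info["task"], info["config"])
```

The project and task segments show that this is a PointRend instance segmentation checkpoint, and the config segment names the backbone and training schedule.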

So how well do the pretrained models that detectron2 ships by default perform? Let's compare them with PointRend.

Following the Model Zoo link from the detectron2 repository above...

https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md


A mask AP of 39.5...?

Scrolling further down, the R101-FPN model (ResNet-101 + FPN) looks like the best of the bunch: mask AP 43.7!

That is still worse than the model banmo already uses. I was about to dig through more models when I decided to go straight for a state-of-the-art model instead.
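The comparison so far is simple arithmetic, sketched below. The mask AP figures are the ones quoted in this post and were not re-measured here. The dictionary keys are descriptive labels chosen for illustration, not official model names.

```python
# Mask AP figures exactly as quoted in this post (not re-measured here).
mask_ap = {
    "PointRend X101-FPN 3x (banmo default)": 44.7,
    "Model Zoo entry with mask AP 39.5": 39.5,
    "Model Zoo R101-FPN": 43.7,
}

baseline = mask_ap["PointRend X101-FPN 3x (banmo default)"]

# Rank the candidates from best to worst and show the gap to the baseline.
ranked = sorted(mask_ap.items(), key=lambda kv: kv[1], reverse=True)
for name, ap in ranked:
    gap = ap - baseline
    print(f"{name}: {ap:.1f} ({gap:+.1f} vs banmo default)")
```

By these numbers, neither Model Zoo candidate beats the model banmo already uses, which is what motivates looking outside detectron2.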

Going with the state of the art

https://github.com/czczup/ViT-Adapter/tree/main/segmentation


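Running inference with the coco-stuff-164k model from this repository gives the following result.

If anything, the result looks worse...?

Result of inference with the pascal model

Huh?? Why does it look like this... Let's pick the output apart piece by piece.

It turned out I had gotten the output format wrong... the real output looks different.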
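The post does not say what the format mistake was. One way to pick an output apart, assuming the model returns a per-pixel map of class indices (the usual form for semantic segmentation), is to count how many pixels fall into each class. The sketch below uses a toy map and hypothetical label names. In practice the map would be an H x W integer array and the names would come from the checkpoint's dataset.

```python
from collections import Counter

# Toy 4x6 class-index map standing in for a model output (hypothetical values).
seg_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 2, 0],
    [0, 0, 1, 0, 2, 0],
]

# Hypothetical label names; the real list depends on the dataset the
# checkpoint was trained on.
class_names = ["background", "dog", "leash"]


def summarize(seg_map, class_names):
    """Return {class name: pixel count} for every class present in the map."""
    counts = Counter(idx for row in seg_map for idx in row)
    return {class_names[idx]: n for idx, n in sorted(counts.items())}


summary = summarize(seg_map, class_names)
for name, n in summary.items():
    print(f"{name}: {n} pixels")
```

A summary like this makes it obvious when the values being displayed are class indices rather than colors, or when the label names are offset from the indices.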

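Wow, nice.

How does it compare with the default model?

Looking closely, the leash is no longer masked, and the region from the body to the chin is a bit more precise!

Let's run inference on a few more images.

The performance looks quite good...

Next, let's check which label index corresponds to the dog and then segment the entire image set.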
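The step described above can be sketched as follows: look up the index of the dog class, then turn each class-index map into a binary dog mask. The class list, file names, and toy predictions below are all hypothetical placeholders. The real label list depends on the checkpoint, and real predictions would be integer arrays produced by the model.

```python
# Hypothetical class list; replace it with the label names of the checkpoint in use.
CLASSES = ("background", "cat", "dog", "person")


def dog_mask(seg_map, classes=CLASSES, target="dog"):
    """Return a 0/1 mask that is 1 wherever the predicted class is `target`."""
    target_idx = classes.index(target)
    return [[1 if idx == target_idx else 0 for idx in row] for row in seg_map]


# Toy class-index maps keyed by hypothetical file names.
predictions = {
    "00000.jpg": [[0, 2, 2], [0, 2, 3]],
    "00001.jpg": [[1, 1, 0], [0, 0, 0]],
}

# Apply the same lookup to every image in the set.
masks = {name: dog_mask(seg) for name, seg in predictions.items()}
for name, mask in masks.items():
    print(name, sum(map(sum, mask)), "dog pixels")
```

Looking the index up by name instead of hard-coding a number keeps the loop valid if the checkpoint, and with it the label order, changes.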
