Install! PyTorch Lecture 10 | Advanced PyTorch Techniques: Exploring Model Internals and Sequential vs ModuleList


Lecture 10-1. The essentials you must know to become a deep learning expert

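In PyTorch, nn.Module is the base class for building deep learning models, and it provides methods for managing a model's layers and parameters. Once you understand these methods, you can inspect and manipulate a model's internal structure freely.[^1]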

parameters() vs named_parameters()

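parameters() returns an iterator over all of the model's learnable parameters (weights, biases, and so on). Each item is an nn.Parameter, which is a subclass of torch.Tensor.[^2]

named_parameters() returns each parameter together with its name. Because it yields (name, parameter) tuples, you can tell exactly which layer each parameter belongs to.[^2]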

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, 3)
        self.fc1 = nn.Linear(64, 10)

model = MyModel()

# Using parameters()
for param in model.parameters():
    print(param.shape)

# Using named_parameters()
for name, param in model.named_parameters():
    print(f"{name}: {param.shape}")
    # Output: conv1.weight: torch.Size([64, 3, 3, 3])
    #         conv1.bias: torch.Size([64])
    #         fc1.weight: torch.Size([10, 64])
    #         fc1.bias: torch.Size([10])
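
A common use of these two methods is counting parameters. The following is a minimal sketch that reuses the MyModel definition above; the variable names total and per_name are chosen here for illustration only.

```python
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, 3)
        self.fc1 = nn.Linear(64, 10)

model = MyModel()

# Total number of scalar values across all parameter tensors
total = sum(p.numel() for p in model.parameters())

# Per-tensor counts, keyed by the dotted parameter name
per_name = {name: p.numel() for name, p in model.named_parameters()}

print(total)     # 2442 = 64*3*3*3 + 64 + 10*64 + 10
print(per_name)
```

The per-name dictionary makes it easy to see which layer dominates the parameter budget.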

children() vs modules()

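children() returns only the model's direct child modules (submodules), that is, the modules assigned as attributes in __init__, without descending into them.[^3][^2]

modules() recursively returns every nn.Module used in the model. It starts with the model itself and then includes all nested submodules.[^4][^2]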

class ConvBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 64, 3)
        self.bn = nn.BatchNorm2d(64)
        self.relu = nn.ReLU()

class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.block1 = ConvBlock()
        self.fc = nn.Linear(64, 10)

model = MyNet()

print("--- children() ---")
for child in model.children():
    print(child)
# Output: ConvBlock(), Linear()

print("--- modules() ---")
for module in model.modules():
    print(module)
# Output: MyNet(), ConvBlock(), Conv2d(), BatchNorm2d(), ReLU(), Linear()

In short, children() looks only one level deep, while modules() recursively visits every nested module.[^5][^4]

named_children() vs named_modules()

The versions prefixed with named_ return each module together with its name:[^3][^4]

for name, module in model.named_children():
    print(f"Name: {name}, Module: {module}")
# Output: Name: block1, Module: ConvBlock()
#         Name: fc, Module: Linear()

for name, module in model.named_modules():
    print(f"Name: {name}, Module: {module}")
# Output: Name: , Module: MyNet()  (the root model has an empty name)
#         Name: block1, Module: ConvBlock()
#         Name: block1.conv, Module: Conv2d()
#         ... (all nested modules)
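
To make the difference in names concrete, here is a small sketch that collects only the names from both methods for the same MyNet model. The lookup through dict(model.named_modules()) is one possible way to fetch a nested module by its dotted name, shown here as an illustration.

```python
import torch.nn as nn

class ConvBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 64, 3)
        self.bn = nn.BatchNorm2d(64)
        self.relu = nn.ReLU()

class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.block1 = ConvBlock()
        self.fc = nn.Linear(64, 10)

model = MyNet()

# Direct children only: one level deep
child_names = [name for name, _ in model.named_children()]

# Every module, recursively; the root model has the empty name ''
module_names = [name for name, _ in model.named_modules()]

# Look up a nested module by its dotted name
conv = dict(model.named_modules())["block1.conv"]

print(child_names)   # ['block1', 'fc']
print(module_names)  # ['', 'block1', 'block1.conv', 'block1.bn', 'block1.relu', 'fc']
```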

Practical examples

1. Training only specific layers (fine-tuning)

# Freeze only the conv layers
for name, param in model.named_parameters():
    if 'conv' in name:
        param.requires_grad = False
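
After freezing, it is often useful to confirm which parameters remain trainable and to hand only those to the optimizer. The sketch below assumes the MyNet model from the earlier example; the choice of SGD and the learning rate are arbitrary placeholders, not values from the lecture.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 64, 3)
        self.bn = nn.BatchNorm2d(64)
        self.relu = nn.ReLU()

class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.block1 = ConvBlock()
        self.fc = nn.Linear(64, 10)

model = MyNet()

# Freeze every parameter whose dotted name contains 'conv'
for name, param in model.named_parameters():
    if "conv" in name:
        param.requires_grad = False

frozen = [n for n, p in model.named_parameters() if not p.requires_grad]
trainable = [n for n, p in model.named_parameters() if p.requires_grad]

# Pass only the trainable parameters to the optimizer (SGD and lr are placeholders)
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.01
)

print(frozen)     # ['block1.conv.weight', 'block1.conv.bias']
print(trainable)  # ['block1.bn.weight', 'block1.bn.bias', 'fc.weight', 'fc.bias']
```

Note that matching on the substring 'conv' depends on how the attributes were named in __init__; checking module types with isinstance is an alternative when names are not reliable.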

2. Weight initialization

for module in model.modules():
    if isinstance(module, nn.Conv2d):
        nn.init.kaiming_normal_(module.weight)
    elif isinstance(module, nn.Linear):
        nn.init.xavier_normal_(module.weight)
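
The same initialization can also be written with nn.Module.apply(), which calls a function on every submodule and then on the model itself. This is a sketch of that alternative; the function name init_weights, the small example model, and the choice to zero the biases are assumptions made for illustration.

```python
import torch
import torch.nn as nn

def init_weights(module):
    # Called once per module by model.apply(); other module types are left untouched
    if isinstance(module, nn.Conv2d):
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
        if module.bias is not None:
            nn.init.zeros_(module.bias)
    elif isinstance(module, nn.Linear):
        nn.init.xavier_normal_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Small example model: (N, 3, 8, 8) -> conv -> (N, 8, 6, 6) -> flatten -> linear
model = nn.Sequential(
    nn.Conv2d(3, 8, 3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 6 * 6, 4),
)
model.apply(init_weights)

biases_are_zero = all(
    bool(torch.all(m.bias == 0))
    for m in model.modules()
    if isinstance(m, (nn.Conv2d, nn.Linear))
)
print(biases_are_zero)  # True
```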

3. Printing the model structure

for name, module in model.named_modules():
    print(f"{name}: {module.__class__.__name__}")

Lecture 10-2. The difference only the top 1% know! nn.Sequential() vs nn.ModuleList()

nn.Sequential()

nn.Sequential is a container that chains layers in order. Data passes through each layer in turn, and the forward() method is defined for you.[^6][^7][^8]

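# Sequential example
import torch

model = nn.Sequential(
    nn.Conv2d(3, 64, 3),           # (N, 3, 32, 32) -> (N, 64, 30, 30)
    nn.ReLU(),
    nn.MaxPool2d(2),               # -> (N, 64, 15, 15)
    nn.Flatten(),                  # -> (N, 64 * 15 * 15); needed before nn.Linear
    nn.Linear(64 * 15 * 15, 10)
)

# forward() is defined automatically
x = torch.randn(1, 3, 32, 32)  # example input; the 32x32 size is an assumption
output = model(x)              # layers run in order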

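Characteristics of Sequential

Automatic forward pass: layers run automatically in the order they were defined[^7][^6]
Sequential connection: each layer's output becomes the next layer's input
Simple structure: well suited to plain feed-forward networks[^7]
Watch the output size: each layer's output size must match the next layer's input size[^6]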
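
The points above can be sketched with a named nn.Sequential built from an OrderedDict, which gives each layer a readable name while keeping the automatic forward pass. The layer names and the 32x32 input size are assumptions chosen for this example.

```python
from collections import OrderedDict

import torch
import torch.nn as nn

model = nn.Sequential(OrderedDict([
    ("conv", nn.Conv2d(3, 8, 3)),        # (N, 3, 32, 32) -> (N, 8, 30, 30)
    ("relu", nn.ReLU()),
    ("pool", nn.MaxPool2d(2)),           # -> (N, 8, 15, 15)
    ("flatten", nn.Flatten()),           # -> (N, 8 * 15 * 15)
    ("fc", nn.Linear(8 * 15 * 15, 10)),  # in_features must match the flattened size
]))

x = torch.randn(4, 3, 32, 32)
out = model(x)  # layers are applied in definition order

print(out.shape)                # torch.Size([4, 10])
print(model[0] is model.conv)   # True: layers are reachable by index or by name
```

If the in_features of the final nn.Linear did not equal 8 * 15 * 15, the forward pass would fail with a shape mismatch error, which is the "watch the output size" caveat in practice.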

nn.ModuleList()

nn.ModuleList is a container that stores modules in a list. It does not define how the modules are connected, so you must implement forward() yourself.[^8][^6]

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(10, 20),
            nn.Linear(20, 30),
            nn.Linear(30, 40)
        ])

    def forward(self, x):
        # forward must be defined manually
        for layer in self.layers:
            x = layer(x)
        return x

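Characteristics of ModuleList

Manual forward definition: you must implement how the modules are connected[^6][^7]
Flexible structure: supports complex logic, conditional execution, and dynamic structures[^7]
Automatic parameter registration: parameters of the modules in the list are registered automatically[^7]
Dynamic add/remove: modules can be added or removed at runtime[^7]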
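
The following sketch illustrates the registration and dynamic-growth points: modules added with append() and extend() are picked up by parameters() just like those passed to the constructor. The class name Stack and the layer sizes are hypothetical.

```python
import torch
import torch.nn as nn

class Stack(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(10, 20)])
        self.layers.append(nn.Linear(20, 30))    # add a single module
        self.layers.extend([nn.Linear(30, 40)])  # add several modules at once

    def forward(self, x):
        # The connection logic is written by hand
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

net = Stack()

# 3 layers x (weight + bias) = 6 registered parameter tensors
num_param_tensors = len(list(net.parameters()))
first_name = next(iter(net.named_parameters()))[0]
out = net(torch.randn(2, 10))

print(num_param_tensors)  # 6
print(first_name)         # layers.0.weight
print(out.shape)          # torch.Size([2, 40])
```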

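Key differences at a glance

Feature	nn.Sequential	nn.ModuleList
forward() method	Defined automatically	Must be defined manually
Connection pattern	Fixed, sequential	Freely customizable
Ease of use	Easy	Relatively complex
Best suited for	Simple feed-forward networks	Complex architectures
Conditional execution	Not supported	Supported
Dynamic structure	Static	Can be dynamic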

When should you use which?

When to use nn.Sequential

Simple sequential structure: layers connected in a single line, as in VGG
Quick prototyping: simple experiments or test models
Clear ordering: the data flow is one-directional and unambiguous

# VGG-style block
block = nn.Sequential(
    nn.Conv2d(64, 128, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2, 2)
)

When to use nn.ModuleList

Complex data flow: skip connections, attention, and so on
Conditional execution: selecting layers based on a condition
Dynamic networks: the structure changes at runtime
Repeated structure: applying the same kind of module multiple times

# ResNet-style skip connection
class ResBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(64, 64, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1)
        ])

    def forward(self, x):
        residual = x
        for layer in self.layers:
            x = layer(x)
        return x + residual  # skip connection

Advanced usage: conditional execution

class ConditionalNet(nn.Module):
    def __init__(self, use_attention=True):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(100, 200),
            nn.ReLU()
        ])
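        if use_attention:
            # AttentionLayer is assumed to be a user-defined nn.Module that maps
            # 200 features to 200 features; it is not defined in this post.
            self.layers.append(AttentionLayer())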
        self.layers.append(nn.Linear(200, 10))

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

Dynamic network example

class DynamicNet(nn.Module):
    def __init__(self, num_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        for _ in range(num_layers):
            self.layers.append(nn.Linear(100, 100))

    def forward(self, x):
        # Use only some of the layers at runtime
        for i, layer in enumerate(self.layers):
            if i % 2 == 0:  # run only the even-indexed layers
                x = layer(x)
        return x

Caveats

Incorrect usage example

# ❌ Using a Python list (parameters are not registered!)
self.layers = [nn.Linear(10, 20), nn.Linear(20, 30)]

# ✅ Using nn.ModuleList (parameters are registered correctly)
self.layers = nn.ModuleList([nn.Linear(10, 20), nn.Linear(20, 30)])

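With a plain Python list, PyTorch does not register the layers as submodules, so their parameters never appear in model.parameters() and the optimizer never updates them. They are also left out of state_dict() and are not moved by model.to(device). Always use nn.ModuleList to hold a list of modules.[^7]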
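
The difference is easy to verify by counting registered parameters. In this sketch the class names WithList and WithModuleList are hypothetical; the layers are the same as in the snippet above.

```python
import torch.nn as nn

class WithList(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain Python list: the layers are invisible to nn.Module
        self.layers = [nn.Linear(10, 20), nn.Linear(20, 30)]

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList: each layer is registered as a submodule
        self.layers = nn.ModuleList([nn.Linear(10, 20), nn.Linear(20, 30)])

n_list = len(list(WithList().parameters()))
n_modulelist = len(list(WithModuleList().parameters()))
saved_keys = list(WithModuleList().state_dict().keys())

print(n_list, n_modulelist)  # 0 4
print(saved_keys)
# ['layers.0.weight', 'layers.0.bias', 'layers.1.weight', 'layers.1.bias']
```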

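Conclusion

The parameters(), modules(), and children() methods are the core tools for exploring and manipulating a model's internal structure. They are essential when implementing techniques such as transfer learning, weight initialization, and selective training.

nn.Sequential suits simple, intuitive sequential structures, while nn.ModuleList suits complex and flexible architectures. Choosing the right tool for the situation is the key to designing PyTorch models efficiently.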
