GraphQL API Performance Monitoring with Apollo Studio

Once your GraphQL API reaches production, the real work begins. Which queries are slow? Which fields are never used? Where do errors cluster? Apollo Studio is built to answer exactly these questions. In this post we walk step by step through integrating Apollo Studio into a real project, reading its performance metrics, and using that data to optimize your API.

What Is Apollo Studio and Why Use It?

Apollo Studio is Apollo's cloud-based monitoring and analytics platform. Formerly known as Apollo Engine and later Graph Manager, it has matured considerably. Even the free tier is enough for many production environments.

At its core, it lets you do the following:

Operation monitoring: Execution time, error status, and usage frequency for every GraphQL operation
Field usage analysis: See which fields in your schema are used and which are not
Tracing: A detailed per-request timeline
Alerting: Get notified on latency spikes or rising error rates
Schema registry: Manage schema versions and check for breaking changes

So with features this good, why doesn't everyone use it? Because without a proper setup the metrics turn into meaningless noise. Let's look at how to set it up properly.

Environment Setup and First Integration

First, let's lay the foundation for the Apollo Server project. We'll build a production-ready setup in TypeScript.

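mkdir apollo-monitoring-demo
cd apollo-monitoring-demo
npm init -y
# The code below uses the Apollo Server 4 entry points (@apollo/server/express4)
# and imports express, body-parser and dataloader
npm install @apollo/server@4 graphql express@4 body-parser dataloader
npm install typescript ts-node @types/node @types/express@4 --save-dev
npx tsc --init
# Directories that the files created later in this post are written into
mkdir -p src/plugins src/routes .github/workflows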

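To integrate with Apollo Studio, first go to studio.apollographql.com, create an account, and create a graph. We'll store the API key you get there in an environment variable. Apollo Server reads APOLLO_KEY and APOLLO_GRAPH_REF from the process environment, so make sure the .env file below is actually loaded at startup (for example with dotenv or your process manager).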

# Create the .env file
cat > .env << 'EOF'
APOLLO_KEY=service:my-graph:xxxxxxxxxxxxxxxxxxxxxx
APOLLO_GRAPH_REF=my-graph@production
PORT=4000
NODE_ENV=production
EOF
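
A missing or malformed key usually shows up only as an empty Studio dashboard, so it can help to fail fast at startup. The following is a minimal sketch, not part of Apollo's API: `readApolloConfig` and the `Env` type are hypothetical names, and the check that insists on an explicit `@variant` is my own assumption about how strict you want to be.

```typescript
// Hypothetical helper: fail fast at startup when Apollo configuration is missing.
type Env = Record<string, string | undefined>;

interface ApolloConfig {
  key: string;
  graphRef: string;
}

function readApolloConfig(env: Env): ApolloConfig {
  const key = env['APOLLO_KEY'];
  const graphRef = env['APOLLO_GRAPH_REF'];
  if (!key) {
    throw new Error('APOLLO_KEY is not set');
  }
  // Assumption: we require the explicit "<graph-id>@<variant>" form,
  // e.g. my-graph@production, to avoid reporting to the wrong variant.
  if (!graphRef || !/^[^@\s]+@[^@\s]+$/.test(graphRef)) {
    throw new Error('APOLLO_GRAPH_REF must look like "<graph-id>@<variant>"');
  }
  return { key, graphRef };
}
```

In a Node process you would call `readApolloConfig(process.env)` before constructing the server.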

Now let's set up the basic server. As an example I'm using the kind of e-commerce API you would run in the real world:

cat > src/index.ts << 'EOF'
import { ApolloServer } from '@apollo/server';
import { expressMiddleware } from '@apollo/server/express4';
import { ApolloServerPluginUsageReporting } from '@apollo/server/plugin/usageReporting';
import { ApolloServerPluginLandingPageProductionDefault } from '@apollo/server/plugin/landingPage/default';
import express from 'express';
import { json } from 'body-parser';
import { typeDefs } from './schema';
import { resolvers } from './resolvers';
import { createContext } from './context';

async function startServer() {
  const app = express();
  
  const server = new ApolloServer({
    typeDefs,
    resolvers,
    plugins: [
      ApolloServerPluginUsageReporting({
        sendVariableValues: { 
          // Send all variable values except these sensitive ones
          exceptNames: ['password', 'creditCard', 'token'] 
        },
        sendHeaders: { 
          onlyNames: ['user-agent', 'x-client-name', 'x-client-version'] 
        },
        // Also report the documents of operations that fail parsing or validation
        sendUnexecutableOperationDocuments: true,
        generateClientInfo: ({ request }) => {
          const headers = request.http?.headers;
          return {
            clientName: headers?.get('x-client-name') || 'unknown',
            clientVersion: headers?.get('x-client-version') || '0.0.0',
          };
        },
      }),
      ApolloServerPluginLandingPageProductionDefault({
        graphRef: process.env.APOLLO_GRAPH_REF,
        footer: false,
      }),
    ],
  });

  await server.start();
  
  app.use(
    '/graphql',
    json(),
    expressMiddleware(server, {
      context: createContext,
    })
  );

  app.listen(process.env.PORT || 4000, () => {
    console.log(`Server ready at http://localhost:${process.env.PORT || 4000}/graphql`);
  });
}

startServer().catch(console.error);
EOF
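
The generateClientInfo callback above only helps if clients actually send those headers. Here is a small sketch of the client side, under the assumption that you keep the custom `x-client-name` and `x-client-version` header names from `src/index.ts`; `clientHeaders`, `resolveClientInfo`, and the `HeaderBag` type are hypothetical names used for illustration.

```typescript
// Header names match the ones read by generateClientInfo in src/index.ts.
type HeaderBag = Record<string, string>;

// Builds the headers a client would attach to every GraphQL request.
function clientHeaders(name: string, version: string): HeaderBag {
  return {
    'content-type': 'application/json',
    'x-client-name': name,
    'x-client-version': version,
  };
}

// Mirrors the server-side fallback logic so the two are easy to compare.
function resolveClientInfo(headers: HeaderBag): { clientName: string; clientVersion: string } {
  return {
    clientName: headers['x-client-name'] || 'unknown',
    clientVersion: headers['x-client-version'] || '0.0.0',
  };
}
```

A client would pass something like `clientHeaders('web-storefront', '1.4.2')` as the headers of its POST to /graphql (the client name and version here are made up). If you drop generateClientInfo entirely, the usage reporting plugin falls back to the `apollographql-client-name` and `apollographql-client-version` headers.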

Schema and Resolver Structure

For a realistic scenario, let's use an e-commerce schema. It will make performance problems easier to see. (The resolvers and context modules imported by src/index.ts are not shown in this post.)

cat > src/schema.ts << 'EOF'
export const typeDefs = `#graphql
  type Product {
    id: ID!
    name: String!
    price: Float!
    description: String
    category: Category!
    reviews: [Review!]!
    inventory: InventoryStatus!
    relatedProducts: [Product!]!
  }

  type Category {
    id: ID!
    name: String!
    products: [Product!]!
  }

  type Review {
    id: ID!
    rating: Int!
    comment: String
    author: User!
    createdAt: String!
  }

  type User {
    id: ID!
    email: String!
    name: String!
    orders: [Order!]!
  }

  type Order {
    id: ID!
    status: OrderStatus!
    items: [OrderItem!]!
    total: Float!
    createdAt: String!
  }

  type OrderItem {
    product: Product!
    quantity: Int!
    unitPrice: Float!
  }

  type InventoryStatus {
    available: Int!
    reserved: Int!
    warehouse: String!
  }

  enum OrderStatus {
    PENDING
    PROCESSING
    SHIPPED
    DELIVERED
    CANCELLED
  }

  type Query {
    products(categoryId: ID, limit: Int, offset: Int): [Product!]!
    product(id: ID!): Product
    categories: [Category!]!
    user(id: ID!): User
    orders(userId: ID!, status: OrderStatus): [Order!]!
  }

  type Mutation {
    createOrder(userId: ID!, items: [OrderItemInput!]!): Order!
    updateOrderStatus(orderId: ID!, status: OrderStatus!): Order!
  }

  input OrderItemInput {
    productId: ID!
    quantity: Int!
  }
`;
EOF

The N+1 Problem and Solving It with DataLoader

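The most common problem you'll run into when using Apollo Studio is the N+1 query problem: fetching a list of N products takes one query, and then resolving a field such as reviews on each product fires N more. Studio's tracing makes this visible. Here is how to batch those lookups with DataLoader: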

cat > src/dataLoaders.ts << 'EOF'
import DataLoader from 'dataloader';

// Loader that batches review lookups across products
export const createReviewLoader = (db: any) =>
  new DataLoader(async (productIds: readonly string[]) => {
    console.log(`Batch loading reviews for ${productIds.length} products`);
    
    // Fetch the reviews for all products in a single query
    const reviews = await db.query(
      `SELECT * FROM reviews WHERE product_id = ANY($1)`,
      [productIds]
    );
    
    // Group the reviews per product ID, in the same order as the input keys
    const reviewsByProduct = productIds.map(id =>
      reviews.filter((r: any) => r.product_id === id)
    );
    
    return reviewsByProduct;
  });

// Loader that batches category lookups
export const createCategoryLoader = (db: any) =>
  new DataLoader(async (categoryIds: readonly string[]) => {
    const categories = await db.query(
      `SELECT * FROM categories WHERE id = ANY($1)`,
      [categoryIds]
    );
    
    return categoryIds.map(id =>
      categories.find((c: any) => c.id === id) || null
    );
  });

// Loader that batches inventory lookups
export const createInventoryLoader = (db: any) =>
  new DataLoader(async (productIds: readonly string[]) => {
    const inventory = await db.query(
      `SELECT * FROM inventory WHERE product_id = ANY($1)`,
      [productIds]
    );
    
    return productIds.map(id =>
      inventory.find((i: any) => i.product_id === id) || {
        available: 0,
        reserved: 0,
        warehouse: 'UNKNOWN'
      }
    );
  });
EOF
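
A DataLoader batch function has to return an array with the same length and order as the keys it received. The loaders above satisfy that with filter and find inside a map, which rescans every row for every key. A single-pass alternative could look like the sketch below; `groupInKeyOrder` is a hypothetical helper, not part of the dataloader package.

```typescript
// Groups rows by key in one pass and returns the groups in the same order as `keys`.
function groupInKeyOrder<Row>(
  keys: readonly string[],
  rows: readonly Row[],
  keyOf: (row: Row) => string,
): Row[][] {
  const buckets = new Map<string, Row[]>();
  for (const key of keys) {
    buckets.set(key, []);
  }
  for (const row of rows) {
    const bucket = buckets.get(keyOf(row));
    // Rows belonging to keys that were not requested are ignored.
    if (bucket) {
      bucket.push(row);
    }
  }
  // One entry per requested key, empty array when nothing matched.
  return keys.map((key) => buckets.get(key) ?? []);
}
```

Inside the review loader this would replace the map/filter step, for example `groupInKeyOrder(productIds, reviews, (r) => String(r.product_id))`. Converting with String also sidesteps the case where the database returns numeric IDs while GraphQL passes them as strings.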

Adding Custom Performance Metrics

Apollo Studio's built-in metrics are great, but sometimes you also need domain-specific ones. To understand which business operations are slow, we use a custom plugin that times each resolver and the operation as a whole:

cat > src/plugins/performancePlugin.ts << 'EOF'
import { ApolloServerPlugin, GraphQLRequestContext } from '@apollo/server';

interface PerformanceMetrics {
  operationName: string | null;
  startTime: number;
  resolverTimings: Map<string, number>;
  dbQueryCount: number;
  cacheHits: number;
}

export const performancePlugin: ApolloServerPlugin = {
  async requestDidStart(requestContext) {
    const metrics: PerformanceMetrics = {
      operationName: requestContext.request.operationName || null,
      startTime: Date.now(),
      resolverTimings: new Map(),
      dbQueryCount: 0,
      cacheHits: 0,
    };

    return {
      async executionDidStart() {
        return {
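          // Unlike the other hooks, willResolveField is synchronous:
          // it must return the end callback directly, not a Promise
          willResolveField({ info }) {
            const fieldStart = Date.now();
            const fieldPath = `${info.parentType.name}.${info.fieldName}`;
            
            return () => {
              const duration = Date.now() - fieldStart;
              // A field inside a list resolves many times; keep the slowest run
              const previous = metrics.resolverTimings.get(fieldPath) ?? 0;
              metrics.resolverTimings.set(fieldPath, Math.max(previous, duration));
              
              // Log resolvers that take longer than 100ms
              if (duration > 100) {
                console.warn(`Slow resolver detected: ${fieldPath} took ${duration}ms`);
              }
            };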
          },
        };
      },

      async willSendResponse() {
        const totalDuration = Date.now() - metrics.startTime;
        
        // Log slow operations and forward them to your alerting system
        if (totalDuration > 1000) {
          console.error(`SLOW OPERATION: ${metrics.operationName} took ${totalDuration}ms`);
          
          // Find the slowest resolvers
          const slowResolvers = Array.from(metrics.resolverTimings.entries())
            .filter(([_, duration]) => duration > 50)
            .sort(([_, a], [__, b]) => b - a)
            .slice(0, 5);
            
          console.error('Slowest resolvers:', slowResolvers);
          
          // You can forward this to your own alerting system
          // await sendAlert({ operation: metrics.operationName, duration: totalDuration, slowResolvers });
        }
        
        // Emit metrics for Prometheus or a similar system
        // metrics.prometheus.histogram.observe(totalDuration / 1000);
      },

      async didEncounterErrors({ errors }) {
        errors.forEach(error => {
          console.error(`GraphQL Error in ${metrics.operationName}:`, {
            message: error.message,
            path: error.path,
            extensions: error.extensions,
          });
        });
      },
    };
  },
};
EOF

Reading Tracing Data in Apollo Studio

Making sense of tracing data in Studio is a skill in its own right. Focus on a few critical metrics:

p50, p95, p99 latency: p50 is the median and shows the performance that half of your requests experience. p99 represents the slowest 1%. As a rule of thumb, a p99 that exceeds three times the p50 in production is a strong signal that the slow tail needs optimizing.

Error rate: Error rates per operation. An error rate above 1% calls for urgent attention.

Request count: How often each operation is called. Knowing this lets you design a caching strategy.

Cache hit rate: If you use Apollo's response cache, this value points to optimization opportunities.
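
To make the percentile vocabulary concrete, here is a small worked example. It uses the nearest-rank definition, which is an assumption on my part: Studio computes its own percentiles and may use a different estimator. The function name and the sample latencies are made up for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) {
    throw new Error('no samples');
  }
  if (p <= 0 || p > 100) {
    throw new Error('p must be in (0, 100]');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const value = sorted[rank - 1];
  if (value === undefined) {
    throw new Error('rank out of range');
  }
  return value;
}

// Ten hypothetical request latencies in milliseconds, with one slow outlier.
const sampleLatenciesMs = [120, 80, 95, 110, 2400, 100, 90, 105, 130, 85];

const p50 = percentile(sampleLatenciesMs, 50); // 100
const p99 = percentile(sampleLatenciesMs, 99); // 2400
const tailRatio = p99 / p50; // 24, far above the 3x rule of thumb
```

A single outlier leaves the median untouched but dominates the p99, which is why the two numbers are worth reading together.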

Field usage reports are especially valuable when cleaning up a schema. You can mark fields that are never used as deprecated and remove them after a set period. To do that, follow this flow:

Go to the “Fields” tab in Studio
Enable the “Unused fields” filter
List the fields that received zero requests in the last 30 days
Mark those fields with @deprecated in your schema
Remove them safely after 60 days
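
The deprecation step in the flow above could look like the following sketch. `legacySku` is a hypothetical field that stands in for whatever Studio reports as unused; it is not part of the schema shown earlier.

```typescript
// Hypothetical example: `legacySku` stands in for a field that Studio reports as unused.
const typeDefsWithDeprecation = `#graphql
  type Product {
    id: ID!
    name: String!
    legacySku: String @deprecated(reason: "No traffic in the last 30 days. Use id instead.")
  }
`;
```

Putting the replacement in the reason string gives any remaining client author something actionable when their tooling surfaces the deprecation.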

Configuring Alerting

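Setting up Apollo Studio's alerting correctly is the foundation of your monitoring strategy. To complement the configuration you do in the Studio UI, let's write a webhook handler. The ApolloAlert payload shape below is illustrative, so check it against the payload your Studio notifications actually send: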

cat > src/routes/webhooks.ts << 'EOF'
import { Router } from 'express';

const router = Router();

interface ApolloAlert {
  type: 'ERROR_PERCENTAGE' | 'REQUEST_COUNT' | 'LATENCY_MS';
  operationName: string;
  currentValue: number;
  threshold: number;
  graphId: string;
  variantName: string;
}

router.post('/apollo-alerts', async (req, res) => {
  const alert: ApolloAlert = req.body;
  
  console.log(`Apollo Studio Alert received:`, alert);
  
  // Slack notification
  if (alert.type === 'ERROR_PERCENTAGE' && alert.currentValue > 5) {
    await sendSlackNotification({
      channel: '#api-alerts',
      // Joined line by line so the message carries no source-code indentation
      message: [
        '🚨 *Critical error rate!*',
        `Operation: ${alert.operationName}`,
        `Current error rate: ${alert.currentValue.toFixed(2)}%`,
        `Threshold: ${alert.threshold}%`,
        `Graph: ${alert.graphId}@${alert.variantName}`,
      ].join('\n'),
    });
  }
  
  // Integrate with PagerDuty or a similar on-call system
  if (alert.type === 'LATENCY_MS' && alert.currentValue > 2000) {
    await triggerPagerDuty({
      severity: 'high',
      summary: `GraphQL Latency Spike: ${alert.operationName}`,
      details: alert,
    });
  }
  
  res.json({ received: true });
});

async function sendSlackNotification(payload: any) {
  // Your Slack webhook integration
  const webhookUrl = process.env.SLACK_WEBHOOK_URL;
  if (!webhookUrl) return;
  
  await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: payload.message, channel: payload.channel }),
  });
}

async function triggerPagerDuty(payload: any) {
  // Your PagerDuty integration
  console.log('PagerDuty trigger:', payload);
}

export default router;
EOF
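
The handler above casts req.body straight to ApolloAlert, so a malformed request would only fail later, for example at toFixed. A runtime guard is one way to reject bad payloads early. This is a sketch that validates the illustrative ApolloAlert shape from this post, not Apollo's documented payload; `isApolloAlert` is a hypothetical name.

```typescript
type AlertType = 'ERROR_PERCENTAGE' | 'REQUEST_COUNT' | 'LATENCY_MS';

// Same illustrative shape as the interface in src/routes/webhooks.ts.
interface ApolloAlert {
  type: AlertType;
  operationName: string;
  currentValue: number;
  threshold: number;
  graphId: string;
  variantName: string;
}

const ALERT_TYPES: readonly string[] = ['ERROR_PERCENTAGE', 'REQUEST_COUNT', 'LATENCY_MS'];

// Type guard: returns true only when every field has the expected runtime type.
function isApolloAlert(body: unknown): body is ApolloAlert {
  if (typeof body !== 'object' || body === null) {
    return false;
  }
  const b = body as Record<string, unknown>;
  const type = b['type'];
  const currentValue = b['currentValue'];
  const threshold = b['threshold'];
  return (
    typeof type === 'string' &&
    ALERT_TYPES.indexOf(type) !== -1 &&
    typeof b['operationName'] === 'string' &&
    typeof currentValue === 'number' &&
    Number.isFinite(currentValue) &&
    typeof threshold === 'number' &&
    Number.isFinite(threshold) &&
    typeof b['graphId'] === 'string' &&
    typeof b['variantName'] === 'string'
  );
}
```

In the route you would check `isApolloAlert(req.body)` first and answer with a 400 status when it returns false. Note that req.body is only populated if a JSON body parser is mounted in front of the router.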

Schema Registry and Breaking Change Management

One of Apollo Studio's most important features is the schema registry. Don't skip integrating it into your CI/CD pipeline:

# Commands to add to your CI/CD pipeline
# First install the Rover CLI
curl -sSL https://rover.apollo.dev/nix/latest | sh

# Publish your schema to the registry
rover graph publish my-graph@production \
  --schema ./src/schema.graphql

# Check for breaking changes - run this before merging
rover graph check my-graph@production \
  --schema ./src/schema.graphql \
  --query-count-threshold 10 \
  --query-count-threshold-percentage 3

To integrate the breaking change check into GitHub Actions, you can use the following workflow:

cat > .github/workflows/schema-check.yml << 'EOF'
name: Apollo Schema Check

on:
  pull_request:
    paths:
      - 'src/schema.ts'
      - 'src/schema.graphql'

jobs:
  schema-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Install Rover
        run: curl -sSL https://rover.apollo.dev/nix/latest | sh
        
      - name: Add Rover to PATH
        run: echo "$HOME/.rover/bin" >> $GITHUB_PATH
        
      - name: Install dependencies
        run: npm ci
        
      - name: Extract Schema
        # src/extractSchema.ts is not shown in this post; it is assumed to print the SDL to stdout
        run: npx ts-node src/extractSchema.ts > schema.graphql
        
      - name: Check Schema for Breaking Changes
        env:
          APOLLO_KEY: ${{ secrets.APOLLO_KEY }}
        run: |
          rover graph check my-graph@production \
            --schema ./schema.graphql
EOF

Lessons Learned in Production

Here are the situations I ran into while using Apollo Studio in a real production environment, and that you will likely run into as well:

Sampling strategy: On high-traffic APIs, sending every operation to Studio is both costly and a drag on performance. The free tier has limits anyway. One solution is to always report critical operations and sample the rest.
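
One way to express "always trace the critical operations, sample the rest" is a function that returns a rate per operation. The sketch below is plain TypeScript; the operation names, the 5% default, and the name `instrumentationRate` are assumptions for illustration. In Apollo Server 4 a function like this can be wired into the usage reporting plugin's fieldLevelInstrumentation option, which controls how many requests carry field-level traces.

```typescript
// Hypothetical list of operations that should always carry field-level traces.
const CRITICAL_OPERATIONS = new Set<string>(['CreateOrder', 'UpdateOrderStatus']);

// Returns the fraction of requests (between 0 and 1) to instrument for an operation.
function instrumentationRate(
  operationName: string | null | undefined,
  defaultRate: number = 0.05,
): number {
  if (operationName && CRITICAL_OPERATIONS.has(operationName)) {
    return 1; // always trace critical operations
  }
  return defaultRate; // sample everything else, including anonymous operations
}
```

A possible wiring would be `fieldLevelInstrumentation: async (ctx) => instrumentationRate(ctx.request.operationName)`. As far as I know, this setting affects the detailed per-field timings rather than whether the operation is counted at all, so request counts in Studio should stay accurate.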

Sensitive data masking: Usage reports can include variable values and headers, so masking PII is mandatory. If you don't get this right from the start, cleaning it up later is genuinely painful.
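
The exceptNames setting used earlier is a denylist: any variable you forget to list is sent as is. An allowlist is the safer default. Below is a sketch of a redaction function; `redactVariables` and the `SAFE_VARIABLES` set are hypothetical, and the variable names are taken from the Query arguments in this post's schema.

```typescript
type Variables = Record<string, unknown>;

// Hypothetical allowlist: only these variable names are reported verbatim.
const SAFE_VARIABLES = new Set<string>(['categoryId', 'limit', 'offset', 'status']);

// Keeps every variable name but replaces non-allowlisted values with a placeholder.
function redactVariables(variables: Variables): Variables {
  const redacted: Variables = {};
  for (const name of Object.keys(variables)) {
    redacted[name] = SAFE_VARIABLES.has(name) ? variables[name] : '[REDACTED]';
  }
  return redacted;
}
```

With Apollo Server 4 this could be passed as `sendVariableValues: { transform: ({ variables }) => redactVariables(variables) }`. For simple cases, `sendVariableValues: { onlyNames: [...] }` gives allowlist behavior without custom code.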

Client version tracking: This matters most for mobile apps. Knowing which app version uses which queries tells you when you can remove old API fields. Be sure to implement the generateClientInfo callback.

Operation naming: Anonymous queries are very hard to analyze in Studio because they cannot be tracked by name. Give every client-side query a meaningful name. Writing query GetUserProfile instead of a bare query produces far more useful metrics.

Segment-based latency analysis: A single latency metric for all users is not enough. Break it down by authenticated vs anonymous, mobile vs desktop, and region. generateClientInfo only reports a client name and version, so one way to do this is to encode the segment in the client name it returns.

As rules of thumb, these are reasonable threshold values to monitor:

p50 latency: under 200ms is good, 200-500ms is acceptable, over 500ms needs optimization
p99 latency: target under 2000ms, anything above is critical
Error rate: under 0.1% is excellent, over 1% is urgent
Cache hit rate: if you use the response cache, under 60% is low
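
If you export these numbers into your own dashboards or checks, the thresholds above translate into a few small functions. This is a sketch: the function names and the 'watch' label for the 0.1% to 1% error band (which the list above leaves unnamed) are my own choices.

```typescript
type LatencyVerdict = 'good' | 'acceptable' | 'needs-optimization';
type ErrorVerdict = 'excellent' | 'watch' | 'urgent';

// p50 latency: under 200ms good, 200-500ms acceptable, over 500ms needs optimization.
function classifyP50(ms: number): LatencyVerdict {
  if (ms < 200) return 'good';
  if (ms <= 500) return 'acceptable';
  return 'needs-optimization';
}

// p99 latency: anything above 2000ms is treated as critical.
function isP99Critical(ms: number): boolean {
  return ms > 2000;
}

// Error rate in percent: under 0.1 excellent, over 1 urgent, 'watch' in between (assumed label).
function classifyErrorRate(percent: number): ErrorVerdict {
  if (percent < 0.1) return 'excellent';
  if (percent <= 1) return 'watch';
  return 'urgent';
}

// Cache hit rate in percent: under 60 counts as low when the response cache is in use.
function isCacheHitRateLow(percent: number): boolean {
  return percent < 60;
}
```

Keeping the thresholds in one place like this makes it easier to tune them per service instead of hard-coding numbers in alert rules.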

Conclusion

Apollo Studio is one of the most valuable tools for keeping you from flying blind with your GraphQL APIs. Set up properly, it shows you in advance which queries are slow, which fields have become dead code, and how breaking changes will affect your clients.

I recommend implementing what we covered today in this order: first do the basic integration and let it collect data for a few days. Then fix N+1 problems with DataLoader. Next, set up alerting, and finally add the schema check to CI/CD. These four steps move your GraphQL API management from reactive to proactive.

I suggest starting with the free tier and considering a paid tier once you have seen how much value it delivers. For most small and mid-sized projects the free tier is more than enough. Happy monitoring!
