Grafana Cloud Engineer

Dallas, TX
Contracted
Experienced

Job Title: Grafana Cloud Engineer
Location: Dallas, TX (Hybrid)
Duration: 12 Months


Job Summary

We are seeking an experienced Grafana Cloud Engineer with strong expertise in designing, implementing, and optimizing enterprise observability solutions using Grafana Cloud. The ideal candidate will have hands-on experience with metrics, logs, traces, dashboards, alerting, and cloud-native monitoring architectures.


Key Responsibilities

Grafana Cloud Implementation & Administration

  • Design, deploy, and manage Grafana Cloud environments for enterprise monitoring solutions.

  • Configure and manage data sources such as Prometheus, Loki, Tempo, InfluxDB, Elasticsearch, and cloud monitoring tools.

  • Integrate monitoring platforms with AWS CloudWatch, Azure Monitor, and GCP Operations Suite.

  • Implement secure authentication and authorization using SSO, OAuth, LDAP, or Azure AD.

  • Optimize Grafana architecture for scalability, performance, and cost efficiency.


Observability & Monitoring Architecture

  • Design and implement end-to-end observability stacks covering metrics, logs, and distributed tracing.

  • Develop monitoring strategies aligned with SRE and DevOps practices (SLIs, SLOs, SLAs).

  • Configure alerting with Grafana Alerting, Alertmanager, and Loki alerting rules, including escalation workflows.


Dashboarding & Visualization

  • Develop custom Grafana dashboards for application and infrastructure monitoring.

  • Translate business and technical monitoring requirements into actionable visualizations.

  • Standardize dashboard templates and monitoring frameworks across teams.


Integration & Automation

  • Integrate Grafana Cloud with Kubernetes clusters, CI/CD pipelines, and cloud platforms.

  • Automate observability deployments using Terraform, Helm, Ansible, or GitOps workflows.

  • Implement OpenTelemetry instrumentation for distributed tracing.


Troubleshooting & Optimization

  • Diagnose and resolve issues related to metrics, logs, traces, and data source configurations.

  • Optimize queries using PromQL, LogQL, and SQL for performance and cost efficiency.

  • Ensure high availability and reliability of monitoring platforms.


Documentation & Knowledge Sharing

  • Develop architecture documentation, runbooks, and best practice guides.

  • Train internal teams on Grafana dashboards, alerting, and observability practices.

  • Act as a Subject Matter Expert (SME) for Grafana Cloud initiatives.


Required Qualifications

  • 5+ years of experience implementing and managing Grafana or Grafana Cloud.

  • Strong hands-on experience with:

    • Grafana Mimir

    • Loki

    • Tempo

    • Prometheus

    • Alertmanager

  • Experience with PromQL, LogQL, and SQL query optimization.

  • Hands-on experience implementing OpenTelemetry instrumentation.

  • Experience with cloud platforms such as AWS, Azure, or GCP.

  • Strong knowledge of Kubernetes, microservices architecture, and cloud-native observability.

  • Experience with DevOps automation tools including Terraform, Helm, Git, and CI/CD pipelines.

  • Strong understanding of SRE principles, monitoring design, and performance engineering.

  • Excellent collaboration skills for working with cross-functional teams.


Preferred Qualifications

  • Grafana Cloud or Prometheus certifications.

  • Experience designing multi-tenant monitoring solutions.

  • Integration experience with ServiceNow, PagerDuty, Opsgenie, or Jira.

  • Knowledge of RBAC, security frameworks, and compliance standards.

  • Experience with other observability tools such as Splunk or New Relic.

  • Ability to work independently in fast-paced environments and support hands-on implementation.
