simplex

Enhance CLI Productivity with AI

[📡 What is it?]

Fabric is an open-source framework designed to augment humans using AI.

It simplifies the process of integrating large language models (LLMs) into command-line workflows by providing a modular framework for solving specific problems with crowdsourced sets of AI prompts that can be used anywhere.

Fabric was created by Daniel Miessler in January 2024.


Purpose and Key Features

The primary goal is to make AI tools more accessible to the broader community, particularly developers, system administrators, and other command-line heroes who want to integrate Generative AI into their workflows efficiently.

  • Simplification: Easy for users to leverage LLMs without having to develop their own frameworks or wrappers.
  • Modularity: Designed as a collection of modular patterns, allowing users to integrate various LLMs seamlessly into their command-line workflows.
  • Crowdsourced Prompts: Diverse set of AI system prompts contributed by the community.

Think of getting a deep summary of a YouTube video that you are interested in but don’t have the cycles to watch. Check out the Extract Wisdom and Extract Instructions patterns.

Here is an example output from the Extract Wisdom pattern applied to the Fabric Origin Story YouTube video.
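
As a minimal sketch of what that invocation can look like, assuming the -y/--youtube transcript flag available in recent Fabric releases (the URL is a placeholder):

$ fabric -y "https://www.youtube.com/watch?v=VIDEO_ID" --stream --pattern extract_wisdom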


🛠 [Problems Solved]

  • Development: Developers can quickly integrate LLMs into their projects without the need to build custom wrappers.
  • System Administration: System administrators can automate routine tasks such as documentation summarization, code generation, and troubleshooting with Fabric, without ever leaving the command line or by calling it from already established automation pipelines.
  • IT Professionals: Command-line heroes can leverage Fabric for various use cases, including data analysis, automation scripts, and more.

📈 [Installation]

This guide walks you through each step with detailed commands and ensures Fabric is ready for use in your command-line workflows on a Fedora 41 laptop.
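
As a rough sketch of the Go-based install on Fedora 41 (the module path and setup flow are assumptions; check the Fabric README for the current instructions):

$ sudo dnf install -y golang
$ go install github.com/danielmiessler/fabric@latest
$ export PATH="$PATH:$HOME/go/bin"
$ fabric --setup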


Real-World Use Cases

Automating Daily Tasks

I often facilitate working sessions while doing discovery at the same time, and these meetings can sometimes last 2 or 3 hours.

I’ve learned that I miss a lot of facts and important information in real time because I’m usually busy making plans or recommendations for the customer, and I only catch them later by re-watching the recording or reading through the meeting transcript.

Now, instead of taking potentially 6+ hours to meet with the customer or team, collate all the data that was captured, and reply with at least a summary, I can do all of this within moments of the meeting ending. The transcript is created automatically and processed by a small 8B-parameter model that is trained and prompted for this specific task, all while keeping the data secure and local.

Here is an example using a community meeting transcript:

$ cat community_improvements_initative_march.txt | fabric --stream --pattern summarize_meeting

```markdown
* Meeting Transcript: Community Improvements Initiative
* Date: March 22, 2023
* Time: 2:00 PM - 3:00 PM

**Attendees**
* Jane Smith, Community Outreach Manager
* John Doe, City Councilor
* Sarah Johnson, Local Business Owner
* Tom Brown, Community Advocate

**Meeting Summary**
The meeting was called to discuss community improvements and initiatives. The discussion lasted for one hour and approximately 5,000 words were spoken during that time.

* Jane Smith: "We believe that our community can be improved by investing in local businesses and supporting their growth. This will not only benefit the business owners but also create jobs and stimulate economic growth."

* John Doe: "I agree with Jane's statement. However, we must also consider the needs of our residents. We should invest in public transportation and infrastructure to make it easier for people to access basic services."

* Sarah Johnson: "As a local business owner, I can attest that investing in businesses is crucial. But we must also consider the environmental impact of our actions. We should aim to create sustainable practices throughout our community."

* Tom Brown: "I appreciate everyone's input. However, we must prioritize the needs of our most vulnerable residents. We should provide more resources for mental health services and address the issue of homelessness in our community."

**Action Items**

* Develop a plan to invest in local businesses.
* Research and recommend public transportation options.
* Create a sustainability committee to promote environmentally friendly practices.

**Next Steps**
The meeting was adjourned with the understanding that the attendees would review the minutes and provide feedback within the next 48 hours.
```

Enhancing Workflow Efficiency

I can continue to leverage Fabric and my local language model to further improve this workflow’s efficiency by using pipes, a staple of every command-line hero’s toolkit.

Taking the generated summary of the meeting transcript from the previous model’s response, we can pipe it into another pattern or even into another LLM altogether.

Here I’m going to summarize the meeting transcript, then pass it to another pattern that will create a formal email and save it as a markdown file.

$ cat community_improvements_initative_march.txt | fabric -sp summarize_meeting | fabric -p create_formal_email -o email-response.md

```markdown
# GREETING
Dear [Title] [Last Name],

# INTRODUCTION

I am writing to follow up on the community development meeting that took place recently. The group discussed various initiatives for community improvement, which I believe are crucial for our town's growth and prosperity.

# BODY

Some of the key discussion points included investing in local businesses, public transportation, environmental sustainability, and addressing homelessness. These topics require careful consideration to ensure we allocate resources effectively.

In response to these discussions, the following action items were assigned:

Develop a plan to invest in local businesses.
Research public transportation options.
Create a sustainability committee.

I would appreciate it if you could review the minutes of our meeting and provide feedback within 48 hours. This will help us move forward with implementing the discussed initiatives.


# CLOSING

Thank you for your time and consideration, and I look forward to hearing back from you soon.

Best regards,

[Your Name]
```

Collaboration and Sharing

This is a very simple example; however, I think we can all see the power built into this workflow. Not only can we chain or pipe both input and output between different Generative AI models, we can also apply different patterns or prompts to individual models and datasets.

Imagine a workflow that could pick up any file that lands in a specific directory. It could be a text note, a website link, an image, or a YouTube video. A job picks up the file and applies a pattern to process the data using Fabric and an LLM, producing multiple output files that represent user stories, tasks and instructions, deep analysis of the content, or even just a description. Multiple outputs with multiple pipe options to enhance or extend the workflow, essentially manipulating the data with Generative AI.
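
As a rough sketch of that idea, a watcher like the one below could feed new files straight into Fabric; this assumes inotify-tools is installed, a drop directory of ~/fabric-inbox, and a summarize pattern available locally:

```bash
# Watch the drop directory and summarize anything that lands in it (sketch only).
inotifywait -m -e create --format '%w%f' ~/fabric-inbox | while read -r newfile; do
  cat "$newfile" | fabric --pattern summarize -o "${newfile%.*}-summary.md"
done
```

From there, each summary file could itself be piped into further patterns, exactly like the email example above.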

I invite you to explore some of the patterns I’ve been creating and testing: Fabric Patterns

Conclusion

The intent is to simplify the integration of Generative AI LLMs into command-line workflows, making them more accessible and practical for developers, system administrators, and other command-line heroes. YouTube integration, URL scraping options, and crowdsourced system prompts ensure that users can benefit from a wide range of AI capabilities without extensive development effort.

5 min read

Post Title Here

[📡 Title or Main Idea]

Insert a catchy tagline or a brief quote summarizing the post.


Quick Summary

Main Topic: Briefly describe what the post is about.
Key Features: Highlight any significant elements (equipment, tools, events).
Outcome: Summarize the result or main takeaway.


🛠 [Section 1: Setup or Context]

Explain the context or background information. Discuss why this topic is important and what you aimed to accomplish.


📈 [Section 2: Analysis or Results]

Provide a detailed explanation of your findings or the results of your project. Use images, tables, or bullet points to make it easy to follow.

Example:

Scenario 1: Low Power Setup

  • Description of the setup
  • Observations or results

Scenario 2: High Power Setup

  • Description of the setup
  • Observations or results

![Image 4: Example Image](../images/me-medano-pass.jpg) _Optional Caption for Image_

NOTE: This is an important note about the image or any other content that should be highlighted.
~1 min read

Simplex Ops: QTH Station

Late fall of 2023

Having moved out of the city and back to my roots in the country, I’ve been finding a few minutes here and there to work on and build up my QTH VHF/UHF station.

~1 min read

DIY: 6 Meter Coax Antenna

My DIY 6 Meter Coax Antenna

Summer 2022 is almost here and I’ve been hearing about the ‘Magic Band’ and how 6 meters can be used to make regional contacts, as opposed to the local contacts I usually make on 2 meter simplex. That means fun in the short term, but in the long term, operating on 6 meters can also come in handy during emergency situations.

I have a TYT TH-9800D quad-band radio that can rx/tx on 6 meters pushing up to 50 watts; all I need now is an antenna. What I have to work with is 50 feet of RG-58 coax cable, and here is how I made my 6 meter antenna using that cable.

2 min read

Simplex Ops: QTH Station

Summer of ‘22

Being recently licensed and still getting used to talking on the radio, I started working on setting up a base station to practice checking into local nets on 2 meters. In the DFW area there are nets multiple times a day that I could potentially check into.

1 min read

ham radio

Who am I?

Hi there! I’m a Red Hat Architect by day, working on supported and enterprise-level open-source software. But when I’m not automating infrastructure provisioning or evangelizing GitOps strategies, you can find me outdoors, gazing at the sky and promoting the art of amateur radio.

1 min read

Antenna


K6ARK QRP Antenna Build

[📡 QRP Antenna Build from K6ARK]

Mixing QRP radio waves with camping stays and hiking days with this little antenna build!


Quick Summary

Main Topic: Pictures and steps from my experience building this QRP antenna from K6ARK.
Key Features: Super tiny components and a rookie at soldering; what could go wrong?
Outcome: A multi-band, resonant, low-power antenna that is lightweight and easily deployable.

2 min read




ISS

Expedition 72 - Series 23 Holidays 2024

📡 SSTV from Space

To celebrate the highlights of Amateur Radio in Space from 2024, ARISS put on “Expedition 72 - Series 23 Holidays 2024” from 12/24/2024 to 01/05/2025 on 145.800 MHz, transmitting a series of Slow Scan TV images.

Quick Summary

Main Topic: Sharing the SSTV images I was able to decode from the ISS.
Key Features: Share some tips, tricks, and lessons learned.
Outcome: It’s pictures… from space!

1 min read

40 Years of Amateur Radio on Human SpaceFlight

📡 SSTV from Space

To celebrate the 40th Anniversary of Amateur Radio in Space, I attempted to capture and decode the SSTV images being transmitted from the ISS. Here is what I did and how I did it!

Quick Summary

Main Topic: Sharing the SSTV images I was able to decode from the ISS.
Key Features: Share some tips, tricks, and lessons learned.
Outcome: It’s pictures… from space!

2 min read


ai

0ri0n: My Local Private AI

Operation 0ri0n - Local AI

Recently, I found time to explore a new area and decided to delve into Data Science, specifically Artificial Intelligence and Large Language Models (LLMs).

Standalone AI Vendors

Using public and free AI services like ChatGPT, DeepSeek, and Claude requires awareness of potential privacy and data risks. These platforms may collect user input for training, leading to unintentional sharing of sensitive information. Additionally, their security measures might not be sufficient to prevent unauthorized access or data breaches.

Users should exercise caution when providing personal or confidential details and consider best practices such as encrypting sensitive data and regularly reviewing privacy policies.

Here are a few vendors that offer open-source models to the public:

Remote Private AI

Running LLMs in a private but remote setup, as shown in the GitHub repository, balances local control and scalability by using external servers or cloud resources dedicated to your organization. This approach enhances data privacy compared to public clouds while offering ease of management, performance benefits, and scalable infrastructure for handling larger workloads.

WARNING: This setup can be very expensive.

This pattern provisions infrastructure and integrates GitHub Actions for streamlined automation.

Local Home Lab AI

Running LLMs locally enhances data privacy, improves performance due to reduced network latency, and offers greater flexibility for customization and integration with on-premises systems. This setup also provides better resource control and can be cost-effective, especially for organizations with existing hardware infrastructure.

0ri0n Local AI

My Home Lab Architecture for Operation 0ri0n

Technical Document: Setting Up Open-WebUI and Ollama

To ensure optimal performance when setting up Open-WebUI and Ollama on Windows Subsystem for Linux (WSL) with GPU support, consider the following hardware components:

  1. GPU (Graphics Processing Unit):
    • A powerful NVIDIA GPU is essential for running LLMs efficiently.
      • 0ri0n: Nvidia GeForce RTX 4090 16 GB
  2. CPU (Central Processing Unit):
    • A high-performance CPU with multiple cores and robust architecture.
      • 0ri0n: Intel Core i7-14700F 2.10 GHz
  3. RAM:
    • At least 16GB of RAM, but preferably 32GB or more for smoother operation and faster model loading times.
      • 0ri0n: 32 GB

This setup will help you run Open-WebUI and Ollama effectively on your system.

By choosing Windows 11 and relying on WSL, we leverage the popularity and ease of use of a Windows environment while harnessing the power of Linux. This setup is convenient for highlighting and testing WSL capabilities.

Steps:

From Windows 11, open a PowerShell prompt as Administrator and run:

WSL Installation:

> wsl --install

This command installs Windows Subsystem for Linux (WSL) to provide a lightweight version of Linux on your Windows machine.

You should see a different prompt when WSL finishes starting.
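
As a quick sanity check (not part of the original steps), you can confirm the distribution registered and is running under WSL 2 from PowerShell:

> wsl -l -v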

Docker Installation:

$ curl https://get.docker.com | sh

This command downloads and runs a script that automatically installs Docker on the system.

Install the NVIDIA Container Toolkit for Docker Containers:

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list && sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
$ sudo service docker restart

These commands download and add the necessary GPG key, configure the NVIDIA container toolkit repository, install the toolkit, and then restart Docker to use the GPU with Docker containers.
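
Before moving on, it can be worth confirming that containers can actually see the GPU; the standard CUDA base-image check should print the same table you get from nvidia-smi on the host (the image tag here is only an example):

$ docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi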

Install Open-WebUI and Ollama:

This command runs a Docker container named ollama using all available GPUs, mounts a volume for persistent storage, exposes port 11434 on both the host and the container, and sets environment variables to enable specific features like flash attention and quantization type.

$ docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama --restart=always -e OLLAMA_FLASH_ATTENTION=true -e OLLAMA_KV_CACHE_TYPE=q4_0 -e OLLAMA_HOST=0.0.0.0 ollama/ollama

NOTE: The environment variables OLLAMA_FLASH_ATTENTION and OLLAMA_KV_CACHE_TYPE enable flash attention and KV-cache quantization. You can omit them from the command if you encounter issues.

This next command runs a Docker container named open-webui, uses the host network mode for better performance, mounts a volume for data persistence, sets the OLLAMA_BASE_URL environment variable, and ensures the container restarts automatically.

$ docker run -d --network=host -v open-web:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

For a quick test, try this command to see how much GPU RAM is being utilized:

$ nvidia-smi -l

Check out Ollama and Open WebUI for more specifics.

Set Up Static IP on 0ri0n:

I configured a static IP, but this is not necessary, especially if you already have DNS or DHCP implemented in your network.

Edit /etc/netplan/01-netcfg.yaml with the following configuration:

network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      addresses:
        - <IP_ADDRESS>/20
      routes:
        - to: default
          via: 172.20.80.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]

This YAML configuration sets up a static IP address for the network interface eth0, assigns it a specific IP, configures the default gateway, and sets DNS servers.
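
After saving the file, apply the configuration for it to take effect:

$ sudo netplan apply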

Expose WSL Port in Windows:

Run the following commands in PowerShell as Administrator:

> netsh interface portproxy add v4tov4 listenport=11434 listenaddress=0.0.0.0 connectport=11434 connectaddress=<IP_ADDRESS>
> netsh advfirewall firewall add rule name="ServicePort11434" dir=in action=allow protocol=tcp localport=11434

This opens the firewall port on Windows 11 so that you can access the API provided by Ollama from other devices on your network.
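
You can verify the port proxy rule was created with:

> netsh interface portproxy show all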

Start WSL on Windows 11 Startup:

This step is optional but ensures that all services come back up if the desktop reboots.

  • Open Task Scheduler by pressing Win + S, type “Task Scheduler”, and open it.
  • Click on “Create Basic Task” in the right pane, give your task a name and description, then click “Next”.
  • Choose “When the computer starts” as the trigger, then click “Next”.
  • Select “Start a program” as the action, then click “Next”.
  • Browse to C:\Windows\System32\wsl.exe.
  • In the “Add arguments” field, enter --distribution Ubuntu (replace Ubuntu with your distribution name if different).
  • Click “Finish” to create the task.

This process sets up a scheduled task in Windows Task Scheduler to start WSL on system startup.
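
If you prefer the command line over the GUI, roughly the same task can be created from an elevated PowerShell prompt; the task name here is arbitrary and the distribution name is an assumption:

> schtasks /create /tn "StartWSL" /tr "C:\Windows\System32\wsl.exe --distribution Ubuntu" /sc onstart /ru SYSTEM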

At this point, you should be able to start inferencing with the models being served or download your first model.
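
For example, to pull a first model into the ollama container and confirm the API answers from the command line (the model name is just an example):

$ docker exec -it ollama ollama pull llama3.1:8b
$ curl http://localhost:11434/api/generate -d '{"model": "llama3.1:8b", "prompt": "Hello", "stream": false}'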

Try accessing Open WebUI at http://localhost:8080, or at whatever IP your Open WebUI Docker instance is using.

Nginx Proxy

I created a proxy that listens on port 443 and passes the traffic to the Docker container and port 8080 for the Open WebUI GUI.

docker run -d --name nginx -p 443:443 -v ~/conf.d:/etc/nginx/conf.d -v ~/ssl:/etc/nginx/ssl --add-host=host.docker.internal:host-gateway --restart always nginx:alpine
cat << 'EOF' > ~/conf.d/open-webui.conf
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    listen 443 ssl;
    server_name <ip_address>;

    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;
    ssl_protocols TLSv1.2 TLSv1.3;

    location / {
        proxy_pass http://host.docker.internal:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;

        # Timeouts
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;

        # Disable buffering for real-time responses
        proxy_buffering off;
    }
}
EOF

NOTE: Update the code to reflect the correct IP address.
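
The config above expects a certificate and key under ~/ssl; for a home lab, a self-signed pair generated with OpenSSL is enough (replace <ip_address> with your own):

$ mkdir -p ~/ssl
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout ~/ssl/nginx.key -out ~/ssl/nginx.crt -subj "/CN=<ip_address>"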

For more info on Nginx

Cloudflare Tunnel:

To access my local model remotely when away from my home network, I created a Cloudflare Zero Trust tunnel. After creating an account and setting up a DNS record, I was given this Docker command with a token to run.

docker run -d cloudflare/cloudflared:latest tunnel --no-autoupdate run --token ******eE9Ea3la**********

This command runs the Cloudflare Docker container in detached mode, enabling a tunnel to route traffic through your machine to services running inside WSL.

Visit Cloudflare for more info on Zero-Trust Tunnels

Pausing Thoughts

Now our setup is complete, and all components are in place for us to:

  • Access the Open WebUI GUI locally and remotely
  • Access Ollama via the CLI locally
  • Leverage the Ollama API locally

In today’s digital age, we constantly navigate between public and private spaces. Striking a balance is key to maintaining control and efficiency.

What is next?

Next, I plan to dive deeper into model specifics around quantization and tuning for efficiency, as well as explore the settings and features in both Ollama and Open WebUI.

6 min read
Back to Top ↑

antenna

DIY: 6 Meter Coax Antenna

My DIY 6 Meter Coax Antenna

Summer 2022 is almost here and I’ve been hearing about the ‘Magic Band’ and how 6 meters can be used to make regional contacts as opposed to just local contacts I usually make on 2  meter simplex.  So that means fun in the short term, but long term operating on 6 meters can come in handy during emergency situations.  

I have a TYT TH-9800D Quad band radio that can rx/tx on 6 meters pushing up to 50 watts, all I need now is an antenna.  What I have to work with is 50 feet of RG-58 coax cable and here is how I made my 6 meter antenna using that cable.

2 min read
Back to Top ↑

cloudflare

0ri0n: My Local Private AI

Operation 0ri0n - Local AI

Recently, I found time to explore a new area and decided to delve into Data Science, specifically Artificial Intelligence and Large Language Models (LLMs).

Standalone AI Vendors

Using public and free AI services like ChatGPT, DeepSeek, and Claude requires awareness of potential privacy and data risks. These platforms may collect user input for training, leading to unintentional sharing of sensitive information. Additionally, their security measures might not be sufficient to prevent unauthorized access or data breaches.

Users should exercise caution when providing personal or confidential details and consider best practices such as encrypting sensitive data and regularly reviewing privacy policies.

Here are a few vendors that offer open-source models to the public:

Remote Private AI

Running LLMs in a private but remote setup, as shown in the GitHub repository, balances local control and scalability by using external servers or cloud resources dedicated to your organization. This approach enhances data privacy compared to public clouds while offering ease of management, performance benefits, and scalable infrastructure for handling larger workloads.

WARNING: This setup can be very expensive.

This pattern provisions infrastructure and integrates GitHub Actions for streamlined automation.

Local Home Lab AI

Running LLMs locally enhances data privacy, improves performance due to reduced network latency, and offers greater flexibility for customization and integration with on-premises systems. This setup also provides better resource control and can be cost-effective, especially for organizations with existing hardware infrastructure.

0ri0n Local AI

My Home Lab Architecture for Operation 0ri0n

Technical Document: Setting Up Open-WebUI and Ollama

To ensure optimal performance when setting up Open-WebUI and Ollama on Windows Subsystem for Linux (WSL) with GPU support, consider the following hardware components:

  1. GPU (Graphics Processing Unit):
    • A powerful NVIDIA GPU is essential for running LLMs efficiently.
      • 0ri0n: Nvidia GeForce RTX 4090 16 GB
  2. CPU (Central Processing Unit):
    • A high-performance CPU with multiple cores and robust architecture.
      • 0ri0n: Intel Core i7-14700F 2.10 GHz
  3. RAM:
    • At least 16GB of RAM, but preferably 32GB or more for smoother operation and faster model loading times.
      • 0ri0n: 32 GB

This setup will help you run Open-WebUI and Ollama effectively on your system.

By choosing Windows 11 and relying on WSL, we leverage the popularity and ease of use of a Windows environment while harnessing the power of Linux. This setup is convenient for highlighting and testing WSL capabilities.

Steps:

From Windows 11, open a PowerShell prompt as Administrator and run:

WSL Installation:

> wsl --install

This command installs Windows Subsystem for Linux (WSL) to provide a lightweight version of Linux on your Windows machine.

You should see a different prompt when WSL finishes starting.

Docker Installation:

$ curl https://get.docker.com | sh

This command downloads and runs a script that automatically installs Docker on the system.

Install NVIDIA Driver for Docker Containers:

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list && sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
$ sudo service docker restart

These commands add the necessary GPG key, configure the NVIDIA Container Toolkit repository, install the toolkit, and restart Docker so that containers can use the GPU. Note that the NVIDIA driver itself lives on the Windows side; WSL exposes the GPU to Linux automatically, so no separate Linux driver is installed here.
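
Depending on the Docker and toolkit versions, Docker may also need to be explicitly configured to use the NVIDIA runtime before the restart. The sketch below shows that documented step plus a quick check that the GPU is visible from a container (the CUDA image tag is only an example):

$ sudo nvidia-ctk runtime configure --runtime=docker   # writes the runtime entry into /etc/docker/daemon.json
$ sudo service docker restart
$ docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi   # should list the RTX GPU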

Install Open-WebUI and Ollama:

This command runs a Docker container named ollama using all available GPUs, mounts a volume for persistent model storage, exposes port 11434 on both the host and the container, and sets environment variables that enable flash attention and K/V-cache quantization.

$ docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama --restart=always -e OLLAMA_FLASH_ATTENTION=true -e OLLAMA_KV_CACHE_TYPE=q4_0 -e OLLAMA_HOST=0.0.0.0 ollama/ollama

NOTE: The environment variables OLLAMA_FLASH_ATTENTION and OLLAMA_KV_CACHE_TYPE enable flash attention and context (K/V-cache) quantization. You can omit them if you encounter issues.

The next command runs a Docker container named open-webui, uses host network mode for better performance, mounts a volume for data persistence, points OLLAMA_BASE_URL at the local Ollama API, and ensures the container restarts automatically.

$ docker run -d --network=host -v open-web:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

For a quick test, try this command to see how much GPU RAM is being utilized:

$ nvidia-smi -l

Check out Ollama and Open WebUI for more specifics.

Set Up Static IP on 0ri0n:

I configured a static IP, but this is not necessary, especially if you already have DNS or DHCP implemented in your network.

Edit /etc/netplan/01-netcfg.yaml with the following configuration:

network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      addresses:
        - <IP_ADDRESS>/20
      routes:
        - to: default
          via: 172.20.80.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]

This YAML configuration sets up a static IP address for the network interface eth0, assigns it a specific IP, configures the default gateway, and sets DNS servers.
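
After saving the file, apply the configuration and confirm the address took effect (this assumes netplan is actually managing networking in your distribution):

$ sudo netplan apply
$ ip addr show eth0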

Expose WSL Port in Windows:

Run the following commands in PowerShell as Administrator:

> netsh interface portproxy add v4tov4 listenport=11434 listenaddress=0.0.0.0 connectport=11434 connectaddress=<IP_ADDRESS>
> netsh advfirewall firewall add rule name="ServicePort11434" dir=in action=allow protocol=tcp localport=11434

The first command forwards traffic arriving on the Windows host's port 11434 to the WSL address, and the second opens that port in the Windows firewall, so the Ollama API can be reached from other devices on your network.
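
As a quick check from another device on the network, the Ollama API should now respond on the forwarded port (replace the address with your Windows host's IP; the model name is only an example):

$ curl http://<WINDOWS_HOST_IP>:11434/api/tags
$ curl http://<WINDOWS_HOST_IP>:11434/api/generate -d '{"model": "llama3:8b", "prompt": "Hello", "stream": false}'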

Start WSL on Windows 11 Startup:

This step is optional, but it ensures that all services come back up automatically if the desktop reboots.

  • Open Task Scheduler by pressing Win + S, type “Task Scheduler”, and open it.
  • Click on “Create Basic Task” in the right pane, give your task a name and description, then click “Next”.
  • Choose “When the computer starts” as the trigger, then click “Next”.
  • Select “Start a program” as the action, then click “Next”.
  • Browse to C:\Windows\System32\wsl.exe.
  • In the “Add arguments” field, enter --distribution Ubuntu (replace Ubuntu with your distribution name if different).
  • Click “Finish” to create the task.

This process sets up a scheduled task in Windows Task Scheduler to start WSL on system startup.

At this point, you should be able to start inferencing with the models being served or download your first model.
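
For example, you can pull and chat with a first model from inside the ollama container (the model name here is just an illustration):

$ docker exec -it ollama ollama pull llama3:8b
$ docker exec -it ollama ollama run llama3:8b "Summarize what a reverse proxy does."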

Try accessing Open WebUI at http://localhost:8080, or substitute the IP address of the machine running your Open WebUI Docker instance.

Nginx Proxy

I created a reverse proxy that listens on port 443 and forwards traffic to the Open WebUI container on port 8080.

docker run -d --name nginx -p 443:443 -v ~/conf.d:/etc/nginx/conf.d -v ~/ssl:/etc/nginx/ssl --add-host=host.docker.internal:host-gateway --restart always nginx:alpine
cat << 'EOF' > ~/conf.d/open-webui.conf
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    listen 443 ssl;
    server_name <ip_address>;

    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;
    ssl_protocols TLSv1.2 TLSv1.3;

    location / {
        proxy_pass http://host.docker.internal:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;

        # Timeouts
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;

        # Disable buffering for real-time responses
        proxy_buffering off;
    }
}
EOF
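
The server block above references certificate files that must exist in ~/ssl, and because the configuration is written after the nginx container has already started, nginx needs to reload it. A minimal sketch using a self-signed certificate (the subject name is a placeholder):

$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout ~/ssl/nginx.key -out ~/ssl/nginx.crt -subj "/CN=<ip_address>"
$ docker exec nginx nginx -t         # validate the new configuration
$ docker exec nginx nginx -s reload  # reload nginx without restarting the container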

NOTE: Update the configuration above to use the correct IP address.

For more information, see the Nginx documentation.

Cloudflare Tunnel:

To access my local model remotely when away from my home network, I created a Cloudflare Zero Trust tunnel. After creating an account and setting up a DNS record, Cloudflare provided this Docker command, including a token, to run.

docker run -d cloudflare/cloudflared:latest tunnel --no-autoupdate run --token ******eE9Ea3la**********

This command runs the cloudflared container in detached mode, establishing a tunnel that routes traffic from Cloudflare's edge to the services running inside WSL.

Visit Cloudflare for more info on Zero-Trust Tunnels
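
To confirm the tunnel actually connected, check the container's logs for registered connections (the filter below simply locates the unnamed cloudflared container):

$ docker ps --filter ancestor=cloudflare/cloudflared:latest --format '{{.ID}}'
$ docker logs <CONTAINER_ID>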

Pausing Thoughts

Now our setup is complete, and all components are in place for us to:

  • Access the Open WebUI GUI locally and remotely
  • Access Ollama via the CLI locally
  • Leverage the Ollama API locally

In today’s digital age, we constantly navigate between public and private spaces. Striking a balance is key to maintaining control and efficiency.

What is next?

Next, I plan to dive deeper into model specifics around quantization and tuning for efficiency, as well as explore the settings and features in both Ollama and Open WebUI.

6 min read
Back to Top ↑

diy

DIY: 6 Meter Coax Antenna

My DIY 6 Meter Coax Antenna

Summer 2022 is almost here, and I've been hearing about the 'Magic Band' and how 6 meters can be used to make regional contacts, as opposed to the local contacts I usually make on 2 meter simplex. That means fun in the short term, but in the long term, operating on 6 meters can come in handy during emergency situations.

I have a TYT TH-9800D quad-band radio that can rx/tx on 6 meters at up to 50 watts; all I need now is an antenna. What I have to work with is 50 feet of RG-58 coax cable, and here is how I made my 6 meter antenna using that cable.

2 min read
Back to Top ↑

nginx

0ri0n: My Local Private AI

6 min read
Back to Top ↑

ollama

0ri0n: My Local Private AI

6 min read
Back to Top ↑

open webui

0ri0n: My Local Private AI

6 min read
Back to Top ↑

vhf

DIY: 6 Meter Coax Antenna

2 min read
Back to Top ↑