AI monitoring for projects using Dapp Staking - Proposal

To everyone who was interested in this proposal, I apologize for the late response.

Due to organizational turmoil within my company, this week has been very hectic.

I currently work for a cybersecurity company in the Web2 sector in Japan. However, the management team has become so preoccupied with the stock price that they have lost sight of their original mission: protecting Japan's Web2 infrastructure. The organization is now in a state where it welcomes social turmoil, and it is in disarray.

This week I was reminded once again of the value of Astar, one of the few blockchain projects with an admirable philosophy. I sincerely hope that Astar will not lose sight of its original purpose.

Therefore, I was only able to devote two days to Astar this week.

However, we have made headway on the AI-based development progress tracking function, so I would like to share an update with you.

AI Selection

We investigated DeepSeek (Standard), DeepSeek-VL, DeepSeek-Coder, Phi-3-mini, etc., and will introduce two particularly interesting AIs.

DeepSeek-VL

It became clear that a vision-language model like this is needed to understand the diagrams and mathematical formulas that frequently appear in PDF whitepapers.
Running this model, however, requires an AWS g5.xlarge (GPU) instance, which brings the monthly cost to about $734.
I believe this is quite expensive for a VPS used solely to retrieve various dApp metrics.
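For reference, this figure appears to correspond to on-demand pricing of roughly $1.006/hour × 730 hours/month ≈ $734, excluding storage and data transfer.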

Additionally, after reviewing a number of whitepapers from different dApps, I found that their documentation methods vary widely, and many do not even mention implementation details. This makes them unsuitable for tracking development progress.

As a result, while this AI was intriguing, we did not proceed to the verification stage.
Another reason we skipped testing was the long wait required to obtain GPU instance access on AWS.

DeepSeek-Coder

This model cannot read PDFs, but it specializes in coding and excels at the kind of detailed code analysis that the other models cannot match.

It also does not require a GPU and has been verified to run on an AWS t3.xlarge instance (4 vCPUs/16 GB memory).

The estimated monthly cost is therefore about $121.47.
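This appears to correspond to on-demand pricing of $0.1664/hour × 730 hours/month = $121.47, again excluding storage and data transfer.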

However, this approach requires the project to additionally provide a simple functional requirements document.

That said, this is probably a correct and reasonable approach.

PoC

We have deployed DeepSeek-Coder on an AWS t3.xlarge instance.

We then gave this proposal a project name, AstarWatch, and created the following requirements specification document (spec.md).

# AstarWatch Functional Requirements Specification

## Obtaining dApp Development Progress Status

* Calculate the progress rate by comparing the functional requirements document (spec.md) with the code placed in the code directory.

## Obtaining dApp User Acquisition Status

* Issue an API request to AstarNode to retrieve the number of user wallets associated with the dApp's contract address.

## Obtaining dApp Marketing Metrics

* Retrieve the number of impressions generated by the social media accounts conducting marketing for the dApp.

I wrote the following script, which passes a prompt and the target code to DeepSeek-Coder and checks whether the features in spec.md are implemented.

import os
import subprocess
import re

# ===== CONFIG =====
LLAMA_CLI_PATH = "./llama.cpp/build/bin/llama-cli"
MODEL_PATH = "./llama.cpp/models/deepseek-coder-6.7b-instruct.Q4_K_M.gguf"
SPEC_PATH = "spec.md"
CODE_DIR = "code"
REPORT_PATH = "report.md"
DEBUG_LOG_PATH = "debug.log"
MAX_TOKENS = 2048
MAX_CODE_CHARS = 4000
# ==================

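# Parse spec.md into a list of (section title, bullet-point details) pairs.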
def parse_spec_with_bullets(path):
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()

    specs = []
    current_title = ""
    current_body = []

    for line in lines:
        if line.startswith("## "):
            if current_title:
                specs.append((current_title, "\n".join(current_body).strip()))
            current_title = line.strip().replace("## ", "")
            current_body = []
        elif line.startswith("* "):
            current_body.append(line.strip("* ").strip())

    if current_title:
        specs.append((current_title, "\n".join(current_body).strip()))

    return specs

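# Collect source files from the code directory, truncated to MAX_CODE_CHARS each.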
def load_code_files():
    code_map = {}

    if os.path.exists(CODE_DIR):
        for fname in os.listdir(CODE_DIR):
            if fname.endswith((".py", ".ts", ".js", ".sol", ".rs")):
                path = os.path.join(CODE_DIR, fname)
                with open(path, "r", encoding="utf-8") as f:
                    code_map[fname] = f.read()[:MAX_CODE_CHARS]

    return code_map

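# Run llama-cli with the given prompt and log both the prompt and the raw response.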
def run_llama(prompt):
    with open(DEBUG_LOG_PATH, "a", encoding="utf-8") as dbg:
        dbg.write("\n\n========== PROMPT ==========\n")
        dbg.write(prompt + "\n")

    result = subprocess.run([
        LLAMA_CLI_PATH,
        "-m", MODEL_PATH,
        "-p", prompt,
        "-n", str(MAX_TOKENS)
    ], stdout=subprocess.PIPE, text=True)

    with open(DEBUG_LOG_PATH, "a", encoding="utf-8") as dbg:
        dbg.write("\n\n========== RESPONSE ==========\n")
        dbg.write(result.stdout + "\n")

    return result.stdout.strip()

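# For each spec item, ask the model file by file whether it is implemented, and write report.md.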
def evaluate(specs, code_map):
    matched = 0
    with open(REPORT_PATH, "w", encoding="utf-8") as rep:
        rep.write("# AstarWatch Progress Report (Simplified)\n\n")

        for title, detail in specs:
            found = False

            for fname, code in code_map.items():
                prompt = f"""[QUESTION]
Does the code below implement the specification?

[FORMAT]
Answer only on the first line using: 'Answer: Yes' or 'Answer: No'.

[Specification Title]
{title}

[Details]
{detail}

[Code]
{code}
"""
                answer = run_llama(prompt)

                first_line = next(
                    (line.strip() for line in answer.splitlines() if line.strip().lower().startswith("answer:")),
                    "No Answer"
                )

                if re.match(r"(?i)^Answer:\s*Yes\b", first_line):
                    matched += 1
                    rep.write(f"- ✅ {title} → `{fname}`\n")
                    found = True
                    break

            if not found:
                rep.write(f"- ❌ {title}\n")

        total = len(specs)
        rep.write(f"\n## ✅ Progress Score: {matched}/{total} ({(matched / total) * 100:.1f}%)\n")

def main():
    if os.path.exists(DEBUG_LOG_PATH):
        os.remove(DEBUG_LOG_PATH)
    if os.path.exists(REPORT_PATH):
        os.remove(REPORT_PATH)

    specs = parse_spec_with_bullets(SPEC_PATH)
    code_map = load_code_files()
    evaluate(specs, code_map)

if __name__ == "__main__":
    main()

When I checked this code (check_progress.py) itself to see if it satisfied the functional requirements (spec.md), I got the following results.

# AstarWatch Progress Report (Simplified)

- ✅ Obtaining dApp Development Progress Status → `check_progress.py`
- ❌ Obtaining dApp User Acquisition Status
- ❌ Obtaining dApp Marketing Metrics

## ✅ Progress Score: 1/3 (33.3%)

Therefore, I believe we have been able to confirm that it works as a minimal PoC.

Issues

  • This PoC took 43 minutes to execute, so there are several things to consider before incorporating it into the Astar Portal; however, since this is not a feature that requires immediacy, I do not think it is a major problem.
  • Currently, only a single program is supported. To achieve the original purpose, we need a mechanism that splits the code into functions or classes, feeds each piece to the AI, and integrates the results (a rough sketch follows below); doing so will, however, increase the processing time even further.
  • For this PoC to work properly, the project needs to provide a spec.md written with a minimum level of granularity, and I am a little worried about whether this requirement will be accepted.
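As a very rough sketch of the splitting mechanism mentioned in the second point, breaking a Python source file into function- and class-level chunks before prompting could look like this (the helper name and chunk size are placeholders, and code in other languages such as Solidity or Rust would need its own parser):

import ast

def split_into_chunks(source, max_chars=4000):
    # Split a Python file into top-level function/class chunks;
    # fall back to the whole (truncated) file if nothing can be parsed.
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return [source[:max_chars]]

    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            segment = ast.get_source_segment(source, node)
            if segment:
                chunks.append(segment[:max_chars])

    # Files with no top-level functions or classes are evaluated as one chunk.
    return chunks or [source[:max_chars]]

Each chunk could then be passed to run_llama() in place of the whole file, with the per-chunk answers combined afterwards, at the cost of proportionally more llama-cli calls.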

I love reading about this; I am really passionate about it. Instead of building a central system that tries to analyze metrics directly on the network, perhaps a dedicated AI could be trained:

  1. Create a framework for a specific AI using Python.

  2. Import libraries for metrics and graphs, such as pandas.

  3. Create a data system for the dApps that belong to dApp Staking, based on GitHub (time-based data). GitHub can be a powerful piece here: projects that wish to maintain their privacy would not have to expose their code, while for projects whose code is public a custom tag could be created. This is just one example; I need to think this idea through further.

Note: this data would be uploaded to a CSV file, which would be imported into the framework.

  4. Train the AI on the CSV, applying the necessary filters to improve the data and fill gaps.

Without going too deep, I think this could be a powerful way to go: annotated dApp data + CSV + specific training that can detect progress. I know it can be tedious in terms of training updates, but once you have a foundation, the changes won't be that difficult.

Once the product exists, I am sure improvements can be implemented. I see the analysis from a demographic point of view: the dApps that occupy space on the network would be the sample, and you would only have to determine which of them grow steadily and which do not.
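A very rough sketch of the CSV-and-pandas part of this idea, covering loading the data, filling gaps, and flagging steady growth (the file name and column names are purely hypothetical):

import pandas as pd

# Hypothetical CSV of per-dApp metrics collected over time.
df = pd.read_csv("dapp_metrics.csv", parse_dates=["date"])

# Fill gaps in each dApp's time series.
df = df.sort_values(["dapp", "date"])
df["stakers"] = df.groupby("dapp")["stakers"].ffill()

# Flag which dApps grow steadily by comparing first and last observed staker counts.
growth = df.groupby("dapp")["stakers"].agg(["first", "last"])
growth["steady_growth"] = growth["last"] > growth["first"]
print(growth)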


Hello, @AstarPunks! Thank you for providing this information, it really helps us understand your background better.

Your profile and motivation definitely align perfectly with what we’re looking for in a Technical Ambassador, so I’d like to invite you to apply to become an Astar Technical Ambassador from Japan.

Please read all the information about our current program and apply here: Your connected workspace for wiki, docs & projects | Notion

Thank you for your enthusiasm!


@Vangardem san

Thank you very much for your extremely interesting proposal.
As you mentioned, I also feel that this approach has a lot of potential for further development.

We should look into what kind of data GitHub can provide,
and whether integration with GitHub Copilot could simplify the process even more.

Analyzing methods to evaluate the future potential of dApps also sounds intriguing.
In any case, if we can provide dApp Stakers with more meaningful metrics about each dApp,
their dissatisfaction will subside, and dApp developers will be more motivated to seriously think through the flow needed to achieve their goals.

Moreover, I believe we need to present developers with a model flow that can serve as a best-practice case for dApp Staking.


Hi @Juminstock san

Thank you.
I will read the Notion documentation carefully and apply to be Astar’s technical ambassador from Japan.

Also, does the PoC above meet our minimum needs?
Is there any mistake in the direction of this PoC?

I look forward to your feedback in determining our future direction.


Yes, I agree; automation through AI is the future. Your approach is interesting, I like it, and it can be applied perfectly to improve dApp Staking data.


Let me know on Discord once you’ve applied.

Regarding this, your idea looks really cool and well-structured. A solid implementation could give us better insight into your project’s functionalities.


@Juminstock san

After carefully reading the Notion documentation, I applied to be a technical ambassador from Japan.
I mentioned it in the general-builder section of the Discord, so please check it out.

Thank you for your feedback on the PoC.

I think the basic configuration is not bad, but
it is technically quite difficult to actually achieve the original purpose,
and the project seems to require a high-performance VPS as a development environment in order to keep testing time down.
That is a bit beyond what I can cover personally.
(If we split the code into classes and functions and pass each one to the AI as a prompt, processing will take even longer.)

I have a goal of building an online developer community in Japan, and I recognize that this fits my role as a Technical Ambassador, but even after reading the Notion documentation, this AI project seems beyond the scope of an ambassador and beyond what I can complete on my own in my spare time outside my day job.

Should I also apply for a Blockchain Backend Engineer position at Startale?

@AstarPunks

I appreciate you taking the time to read the document I shared and for applying as a technical ambassador for Astar.

Regarding your role and goals, let's continue that conversation on Discord since it's not directly relevant to this discussion thread; that way we can keep this conversation focused on your project.

If you’re interested in working for Astar, here’s a link to our current job openings: Startale | Career

We truly appreciate your enthusiasm and your willingness to contribute to Astar.


Thank you. @Juminstock san

I am also very interested in working at Astar.
How can I contact you on Discord? Can I send a DM?
I'm wondering: it seems I can become an ambassador anonymously, but is it possible to work full-time at Astar anonymously?
I want to remain anonymous in the Web3 space…

For now, I’d like to continue the conversation focusing on this project.

However, it is true that a more capable VPS is needed to move forward with this project.

With the current environment, which I set up at my own expense,
each test takes more than 40 minutes, and testing the implementation needed to achieve the goal will take even longer.

I think the final cost of the production VPS can be kept low, but the VPS needed for testing has to be a high-spec one.
How can I claim development costs?

Also, the current PoC requires the project to provide a functional requirements document,
but I have doubts as to whether this condition will be met.

Thank you for the information. I learned a lot! May I ask a bit more? Regarding DeepSeek, I'm not sure if I understand correctly: DeepSeek has the advantage of relatively low cost, but on the other hand, are there limitations on the output it produces? From what I've tried before, after chatting for a while the answers sometimes switch to Chinese, and it also can't answer some things if the topic is sensitive. =)


@BoomBLB san

Thank you for your interesting question.

I noticed that DeepSeek tends to switch to Chinese responses when the conversation drags on.
I have confirmed that this issue can be avoided by explicitly instructing the model, in every prompt, to answer in English.
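In a script like the PoC above, that just means adding one explicit instruction to the prompt, for example (the wording is only an illustration):

# Prepend an explicit language instruction before calling run_llama().
prompt = "Answer in English only.\n\n" + prompt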

In addition, when it comes to historical, political, and topical issues, the model may avoid answering or give an answer that differs from the perspective of other countries.
This is very dangerous because it causes cultural clashes between users from different countries.

However, I do not think this is a problem specific to DeepSeek.

I recognize that each country has its own interpretation of history, and current AI systems tend to reinforce the legitimacy of their own country’s narrative.

That said, when it comes to tasks like code tracing, AI has performed very well, and as long as it is used exclusively in specific fields, it should not cause any major problems.


I understand the point completely. Right now, we're probably in the first phase of the AI era together, and we'll likely need to learn many things together. Anyway, if I come across any other case studies, may I consult with you again? Thank you very much. =)


Of course!

I deeply respect all of you who have chosen to contribute to Astar out of the many blockchains available, and I want to support you in any way I can.

As AI and quantum computing evolve, it will be very interesting to think about the direction Astar should take in light of these developments.

I look forward to contributing to the development of Astar together with you.
Thank you very much.

By the way, I actually don’t really like calling today’s AI “AI.” lol
LLMs are LLMs.
AI created through quantum computing might have consciousness, so I’d like to call that true “AI” when the time comes. lol


I completely agree. Someday when technology advances even further, we’ll probably be able to call it AI with even more confidence. =)
