@BehindTheBarrier

BehindTheBarrier@programming.dev · edit-2 8 days ago

What’s fun is determining which function in that list of functions actually is the one where the bug happens and where. I don’t know about other langauges, but it’s quite inconvenient to debug one-linres since they are tougher to step through. Not hard, but certainly more bothersome.

I’m also not a huge fan of un-named functions so their functionality/conditions aren’t clear from the naming, it’s largely okay here since the conditional list is fairly simple and it uses only AND comparisons. They quickly become mentally troublesome when you have OR mixed in along with the changing booleans depending on which condition in the list you are looking at.

At the end of the day though, unit tests should make sure the right driver is returned for the right conditions. That way, you know it works, and the solution is resistant to refactor mishaps.

BehindTheBarrier@programming.dev · edit-2 18 days ago

But nothing is forcing you to check exeptions in most languages, right?

While not checking for exceptions and .unwrap() are pretty much the same, the first one is something you get by not doing anything extra while the latter is entirely a choice that has to be made. I think that is what makes the difference, and in similar ways why for example nullable enabled project in C# is desired over one that is not. You HAVE to check for null, or you can CHOOSE to assume it is not by trying to use the value directly. To me it makes a difference that we can accidentally forget about a possible exception or if we can choose to ignore it. Because problems dealt with early at compile time, are generally better than those that happen at runtime.

BehindTheBarrier@programming.dev · 19 days ago

It can be pretty convenient to throw an error and be done with it. I think for some languages like Python, that is pretty much a prefered way to deal with things.

But the entire point of Rust and Result is as you say, to handle the places were things go wrong. To force you to make a choice of what should happen in the error path. It both forces you to see problems you may not be aware of, and handle issues in ways that may not stop the entire execution of your function. And after handling the Result in those cases, you know that beyond that point you are always in a good state. Like most things in Rust, that may involve making decisions about using Result and Option in your structs/functions, and designing your program in ways that force correct use… but that a now problem instead of a later problem when it comes up during runtime.

BehindTheBarrier@programming.dev · 23 days ago

If Reddit back in the day had asked a few dollars for me to stick with using 3rd party apps using the API I would have. But they did the opposite, so here I am. First time actually donating to something, a measily $2 dollars a month, but hopefully a start to fund some of the free stuff I use.

BehindTheBarrier@programming.dev · edit-2 25 days ago

I largely agree with this nodding along to many of the pitfalls presented. Except numbers 2s good refactor. I hope I won’t sound too harsh/picky for an example that perhaps skipped renaming for clarity on the other parts, but I wanted to mention it.

While I don’t use javascript and may be missing some of the norms and context of the lanugage, creating lamda functions (i don’t know the js term) and then hardcoding them into a function is barely an improvement. It’s fine because they work well with map and filter, but it didn’t address the vague naming. Renaming is refactoring too!

isAdult is a simple function with a clear name, but formatUser and processUsers are surprisingly vague. formatUser gives only adult FormattedUsers, and that should probably be highlighted in the name of formatUser now that it is a resuable function. To me, it seems ripe for mistaken use given that it is the filter that at a glance handles removing non-adult users before the formatting, while formatUser doesn’t appear to exepct only adult users from it’s naming or even use! Ideally, formatUser should have checked the age on it’s own and set isAdult true/false accordingly, instead of assuming it will be used only on adult Users.

Likewise, the main function is called processUsers but could easily have been something more descriptive like GetAdultFormattedUsers or something similar depending on naming standards in js and the context it is used in. It may make more sense in the actual context, but in the example a FormattedUser doesn’t have to be an adult, so a function processing users should clarify that it only actually creates adult formatted users since there is a case where a FormattedUser is not an adult.

BehindTheBarrier@programming.dev · 26 days ago

He may have used the wrong word, but maintaining the same function signature across two files (while made easier by IDE tools) sucks ass to do. It was one of the major pain points for me doing a C++ course (along with the abyssmal compilation error messages). Not that I have tried Zig, but I do not see a reason to involve header files in my life if I can avoid it.

BehindTheBarrier@programming.dev · 1 month ago

Late here, but if you want the easy route then there is always Unity (C#) if it fits for your use case in game dev and the license isn’t a problem for you.

BehindTheBarrier@programming.dev · 2 months ago

Compute becomes cheaper and larger undertakings happen. LLMs are huge, but there is new tech moving things along. The key part in LLMs, the transformer is getting new competition that may surpass it, both for LLMs and other machine learning uses.

Otherwise, cheaper GPUs for us gamers would be great.

BehindTheBarrier@programming.dev · edit-2 2 months ago

The difference is, with a build pattern you are sure someone set the required field.

For example, actix-web you create a HttpResponse, but you don’t actually have that stuct until you finish the object by setting the body() or by using finish() to have an empty body. Before that point you have a builder.

There is noting enforcing you to set the input_directory now, before trying to use it. Depending on what you need, that is no problem. Likewise, you default the max_depth to a value before a user sets one, also fine in itself. But if the expectation is that the user should always provide their own values, then a .configre(max_depth, path) would make sense to finish of the builder.

It might not matter much here, but if what you need to set was more expensive struts, then defaulting to something might not be a good idea. Or you don’t need to have Option<PathBuf> and check every time you use it, since you know a user provided it. But that is only if it is required.

Lastly, builder make a lot of sense when there is a lot to provide, which would make creating a strict in a single function/line very complicated.

Example in non-rust: https://stackoverflow.com/questions/328496/when-would-you-use-the-builder-pattern

BehindTheBarrier@programming.dev · edit-2 2 months ago

It’s hyperbole, but I learned my first language because I wanted to be a god.

I saw these magic windows that popped up, that had buttons, and I was jealous of these godly creators holding the power to make them do as they wanted. So, I learned it myself. I peeked at another program I was using, it was using python and PyQt so that’s what I set out with to become my own god of the desktop.

My first program was a GUI wrapper around the YouTube-dl CLI, and I still use it frequently.

BehindTheBarrier@programming.dev · edit-2 3 months ago

I am very content with Riders “hide whitespace and newlines” diff option. Frankly after starting to use auto format on code, all old files that got messy in the diffs next time they were changed.

There’s some other nitpicks that some more aware diff could have but outside python few changes in whitespace matters, so seeing every new line is a waste and visual burden in any review for me.

BehindTheBarrier@programming.dev · edit-2 3 months ago

If CO2 is a byproduct of another process, then I’d make a guess it is fairly cheap. The flaw here is that CO2 and H2 are both products of steam reforming using methane… Which is to say, the cheaper version might just come from using natural gas. Hydrogen has to be sourced from some energy consuming process, and that too is often from the methane steam reformation. So it’s certainly possible, but yet again is ready to become yet another “green” product made from fossil fuel. Doesn’t have to be, but I can be.

Edit: to correct a discrepancy, the article mentioned hydrogen, but if the hydrgon comes from water used in the process then some of the issues of providing H2 is less big. But either way I expect this to be energy costly. Nevertheless, a lab made product is still something that doesn’t need large areas of land to produce.

BehindTheBarrier@programming.dev · 3 months ago

True, but this is solid state so it may be higher density than current Lithium based batteries. But it might not beat a hypothetical lithium solid state battery. On the other hand, sodium batteries today beat out lithium in many other ways than capacity, and if those things are true for solid state then as long as there is a big enough jump in capacity due to the solid state transition then I think sodium is going to be the go-to for most uses in the future.

BehindTheBarrier@programming.dev · edit-2 3 months ago

Quick google shows that Kanban is a method. Mainlu around picking up things as the come, but also limiting how much can happen at once.

The project I’m has a team that uses Kanban for the “Maintenance” tasks/development, take what is at the top of the board and do it. Adapt if higher priority things comes around, such as prod bugs. Our developments teams are trying to implement Scrum, where interruptions are to be avoided if possible during sprints. You plan a sprint, try to do that work, and can present it, and iterate when users inevitably changes criteria.

In the meme, kanban does somewhat make sense, since getting armrests is never going to get a high priority as part of building a rocket. Scrum isn’t exactly right, but I can see where it’s coming from. They are all agile methods though.

BehindTheBarrier@programming.dev · edit-2 3 months ago

I kinda get where he is coming for though. AI is being crammed into everything, and especially in things where they are not currently suited to be.

After learning about Machine learning, you kind realize that unlike “regular programs” that ML gives you “roughly what you want” answers. Approximations really. This is all fine and good for generating images for example, because minor details being off of what you wanted probably isn’t too bad. A chat bot itself isn’t wrong here, because there are many ways to say the same thing. The important thing is that there is a definite step after that where you evaluate the result. In simpler ML you can even figure out the specifics of the process, but for the most part we evaluate what the LLM said or if the image is accurate to our expectations. But we can’t control or constrain the output to exactly our needs, because our restrictions largely are just input in a almost finished approximation engine.

The problem is, that companies take these approximation engines, put them in their product and consider their output fact. Like Ai chatbots doing customer support, and make up facts like the user that was told about rules that didn’t exist for an airline, or the search engines that parrot jokes or harmful advice. Sure you and I might realize that these things come from a machine that doesn’t actually think about it’s answers, but others don’t. And throwing a “*this might be wrong because its AI” on it is not an acceptable waiver of accountability.

Despite this, I use chatgpt and gemini a lot to help me program, they get a lot of things wrong but also do great. It’s a great tool, exactly because I step in after the approximation step, review and decide. I’m aware of the limits. But putting these things in front of “users” without a review step means you are advertising that you are either unaware of this flaw, or just see the cost-benefit analysis and see that if noting else it’ll generate interest during the hype.

There is a huge potential, but throwing AI into a situation where facts are needed when it’s only making rough guesses, is the wrong way about it.

BehindTheBarrier@programming.dev · 3 months ago

Working on a blog(entirely for fun), found out the server backend, actix-web, does not handle early termination of a stream well. Wanted to stop an upload if the file size turned out to be too large, but you have to consume the entire upload before returning an error. If not the client will never see the connection close. I guess there is a way to check the size beforehand, but sucks that you can’t stop a stream in progress.

Apparently a long standing issue.

BehindTheBarrier@programming.dev · edit-2 6 months ago

Ah, so I’m actually cheating with the pointer reading, i’m actually making a clone of Arc<T> without using the clone()… And then dropping it to kill the data. I had assumed it just gave me that object so I could use it. I saw other double buffer implementations (aka write one place, read from another palce, and then swap them safely) use arrays with double values, but I wasn’t much of a fan of that. There is some other ideas of lock free swapping, using index and options, but it seemed less clean. So RwLock is simplest.

And yeah, if I wanted a simple blog, single files or const strings would do. But that is boring! I mentioned in the other reply, but it’s purely for fun and learning. And then it needs all the bells and whistles. Writing html is awful, so I write markdown files and use a crate to convert it to html, and along the way replace image links with lazy loading versions that don’t load until scrolled down to. Why, because I can! Now it just loads from files but if I bother later i’ll cache them in memory and add file watching to replace the cached version. Aka an idea of the issue here.

BehindTheBarrier@programming.dev · edit-2 6 months ago

Thanks for the great reply! (And sorry for that other complicated question… )

Knowing that &str is just a reference, makes sense when they are limited to compile time. The compiler naturally knows in that case when it’s no longer used and can drop the string at the appropriate time. Or never dropped in my case, since it’s const.

Since I’m reading files to serve webpages, I will need Strings. I just didn’t get far enough to learn that yet… and with that ‘Cow’ might be a good solution to having both. Just for a bit of extra performance when some const pages are used a lot.

For example code, here’s a function. Simply take a page, and constructs html from a template, where my endpoint is used in it.

pub fn get_full_page(&self, page: &Page) -> String {
        self.handler
            .render(
                PageType::Root.as_str(),
                &json!({"content-target": &page.endpoint}),
            )
            .unwrap_or_else(|err| err.to_string())
    }

Extra redundant context: All this is part of a blog I’m making from scratch. For fun and learning Rust, and Htmx on the browser side. It’s been fun finding out how to lazy load images, my site is essentially a single paged application until you use “back” or refresh the page. The main content part of the page is just replaced when you click a “link”. So the above function is a “full serve” of my page. Partial serving isn’t implemented using the Page structs yet. It just servers files at the moment. When the body is included, which would be the case for partial serves i’ll run into that &str issue.

BehindTheBarrier@programming.dev · 6 months ago

Sorry, but a long and slightly complicated question, for a hypotetical case.

I wanted to serve pages in my blog. The blog doesn’t actually exist yet (but works locally, need to find out how I can safely host it later…), but lets assume it becomes viral, and by viral i mean the entire internet has decided to use it. And they are all crazy picky about loading times…

I haven’t figued out the structure of the Page objects yet, but for the question they can be like the last question:

#[derive(Clone)]
pub struct Page<'a> {
    pub title: &'a str,
    pub endpoint: &'a str,
}

I wanted to create a HashMap that held all my pages, and when I updated a source file, the a thread would replace that page in the mapping. It’s rather trivial of a problem really. I didnt find out if I could update a mapping from a thread, so I decided to make each value something that could hould a page and have the page object replaced on demand. It made somewhat sense since I don’t need to delete a page.

There is a trivial solution. And it’s just to have each HashMap value be a RwLock with an Arc holding my large string. No lagre string copies, Arc make it shared, and RwLock is fine since any number of readers can exist. Only when writing is the readers locked. Good enough really.

But I heard about DoubleBuffers, and though, why can’t I have a AtomicPointer to my data that always exist? Some work later and I had something holding an AtomicPointer with a reference to an Arc with my Page type. But it didn’t work. It actually failed rather confusingly. It crashed as I was trying to read the title on my Page object after getting it from the Arc. It wasn’t even any thread stuff going on, reading once works, the next time it crashed.

struct SharedPointer<T> {
    data: AtomicPtr<Arc<T>>,
}

impl<T> SharedPointer<T> {
    pub fn new(initial_value: T) -> SharedPointer<T> {
        SharedPointer {
            data: AtomicPtr::new(&mut Arc::new(initial_value)),
        }
    }

    pub fn read(&self) -> Arc<T> {
        unsafe { self.data.load(Relaxed).read_unaligned() }.clone()
    }

    pub fn swap(&self, new_value: T) {
        self.data.store(&mut Arc::new(new_value), Relaxed)
    }
}

#[test]
pub fn test_swapping_works_2() {
    let page2: Page = Page::new("test2", "/test2");
    let page: Page = Page::new("test", "/test");
    let entry: SharedPointer<Page> = SharedPointer::new(page.clone());

    let mut value = entry.read();

    assert_eq!(value.title, page.title);
    value = entry.read();
    assert_eq!(value.title, page.title);

    entry.swap(page2.clone());

    let value2 = entry.read();
    assert_eq!(value2.title, page2.title);
    assert_eq!(value.title, page.title);
}

This has undefined behavior, which isn’t too surprising since I don’t understand pointers that much… and I’m actually calling unsafe code. I have heard it can produce unexpected error outside it’s block. I’m just surprised it works a little. This code sometimes fails the second assert with an empty string, crashes with access violation, or one time it gave me a comparison where some of it was lots of question marks! My best understanding is that my Page or it’s content is moved or deallocated, but odd that my Arc seems perfectly fine. I just don’t see the connection between the pointer and Arcs content causing a crash.

I may just be doing the entire thing wrong, so sticking with RwLock is much better and safer since there is no unsafe code. But I seek to know why this is so bad in the first place. What is wrong here, and is there a remedy? Or is it just fundamentally wrong?

BehindTheBarrier@programming.dev · 6 months ago

Hi,

Learning Rust and getting caught up in details, but I always want to know the whys of things, and the minor differences.

Lets start of with, is there a difference between the const value vs const reference?

// Given this little struct of mine, a Page with information about an endpoint
#[derive(Clone)]
pub struct Page<'a> {
    pub title: &'a str,
    pub endpoint: &'a str,
}

// Value
const ROOT_PAGE: Page = Page::new("Home", "/home");

// Reference
const ROOT_PAGE: &'static Page = &Page::new("Home", "/home");

Since my functions always take a reference, is there any advantage to any of them. References are read-only, but since it’s const it probably doesn’t matter. What is prefered?
I know String does allocations, while &str is a string slice or something which may be on the stack. Do I not end up making any allocations in this case since stucts are on the stack by default, and only hold the pointers to a string “slice”. Especially given how they are made in this case.
Is structs with references like this okay, this Page is constant but I’m going to make many “Pages” later based on the pages my blog has, as well as some other endpoints of course.
If the struct is cloned, is the entire string as well, or just the pointer to the string slice? I assume it does copy the entire string, since to the best of my knowledge a &str does not do any reference counting, so deleting a &str means deleting it’s content, not just the reference. Since that is the case, a clone will also copy the string.
I am contemplating adding a “body” string to my Page struct. These string will of course be large and vary depending on the page. Naturally, I do not want to end up copying the entire body string every time the Page is cloned. What is the best course here, it kind of depends on the previous question, but is an Arc the solution here? There is only reading to be done from them, so I do not need to worry about any owner holding it.